Project: Investigate a Dataset - [Dataset-name]

Table of Contents

Introduction

Dataset Description

Question(s) for Analysis

In [28]:
#Importing the packages used in this analysis
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import re

#Magic command so that the visualisations are plotted inline with the notebook
%matplotlib inline
In [29]:
# Upgrade pandas to use dataframe.explode() function. 
#!pip install --upgrade pandas==1.1.5

Data Wrangling

General Properties

Data selection

For this project I will use data from Gapminder (https://www.gapminder.org/data/). For my analysis, I have specifically selected the following indicators:

  • Income per person (GDP/capita, PPP$ inflation-adjusted): Gross domestic product per person adjusted for differences in purchasing power (in international dollars, fixed 2011 prices, PPP based on 2011 ICP).
  • Life expectancy (years): The average number of years a newborn would live if current mortality patterns were to stay the same.
  • Life expectancy, male: Life expectancy at birth for males.
  • Life expectancy, female: Life expectancy at birth for females.

Range of years and countries of interest

Gapminder incorporates data from the 1950s, with projections up to 2099, for about 200 countries (some countries are excluded from certain indicators). This is a huge amount of data to analyse.

I therefore chose to limit the range of data for my analysis to historic data for the past five full decades (i.e. 1971 to 2020), as well as consider only countries in my current geography, the Southern African Development Community (SADC). I believe this focus on a specific data set of interest will allow me to draw more impactful insights than if I had gone with a general view of the entire data set.

SADC is a grouping of 16 countries in the southernmost part of Africa.

In [30]:
#Defining the columns that I want to use across all the datasets
select_cols = ['country'] + [str(year) for year in range(1971, 2021)]
In [31]:
#Defining the countries that I want to use across all the datasets
sadc_countries = ['Angola', 'Botswana', 'Comoros', 'Congo, Dem. Rep.', 'Eswatini', 'Lesotho', 'Madagascar', 'Malawi', 'Mauritius', 'Mozambique', 'Namibia', 'Seychelles', 'South Africa', 'Tanzania', 'Zambia', 'Zimbabwe']
In [32]:
#Creating a list of only the year columns
years = [str(year) for year in range(1971, 2021)]

Loading and subsetting the data

This section loads the four indicators listed above into dataframes by reading the CSV files downloaded from the Gapminder website.

The loaded data is then subset to the years and countries of interest.

Finally, I take a quick look at the data to confirm that the loading and subsetting were successful by checking the number of rows and columns (shape) of each dataframe and showing its first few rows.
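
Because the same three steps (read, select columns, filter countries) are repeated for every indicator, they could be wrapped in a small helper. The sketch below is illustrative only: `load_sadc_indicator` is a hypothetical name that is not used in the cells of this notebook, and the inline CSV text is made-up sample data standing in for a downloaded Gapminder file.

```python
import io

import pandas as pd


def load_sadc_indicator(source, columns, countries):
    """Read a Gapminder-style CSV and keep only the chosen columns and countries.

    Hypothetical helper: `source` is a file path or a file-like object.
    """
    df = pd.read_csv(source)
    df = df[columns]                        # keep 'country' plus the years of interest
    df = df[df['country'].isin(countries)]  # keep only the countries of interest
    return df.reset_index(drop=True)


# Made-up sample standing in for a downloaded CSV file
sample_csv = io.StringIO(
    "country,1970,1971,1972\n"
    "Angola,1,2,3\n"
    "Brazil,4,5,6\n"
    "Zambia,7,8,9\n"
)

demo_df = load_sadc_indicator(sample_csv, ['country', '1971', '1972'], ['Angola', 'Zambia'])
print(demo_df.shape)
```

With the real files, the first argument would be the CSV file name and the other two would be `select_cols` and `sadc_countries`.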

In [33]:
#Loading and delimiting the Income per person (GDP/capita, PPP$ inflation-adjusted) data
income_df = pd.read_csv('income_per_person_gdppercapita_ppp_inflation_adjusted.csv')
income_df = income_df[select_cols]
income_df = income_df[income_df.country.isin(sadc_countries)].reset_index(drop=True)
print(income_df.shape)
income_df.head()
(16, 51)
Out[33]:
country 1971 1972 1973 1974 1975 1976 1977 1978 1979 ... 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
0 Angola 3200 3170 3320 3170 1990 1770 1740 1780 1780 ... 7680 8040 8140 8240 8040 7570 7310 6930 6670 6120
1 Botswana 1700 2080 2440 2570 2680 2860 3040 3260 3580 ... 13.7k 14.2k 15.6k 16k 14.9k 15.7k 15.9k 16.2k 16.3k 14.6k
2 Congo, Dem. Rep. 2760 2700 2850 2860 2650 2440 2400 2210 2160 ... 895 927 972 1030 1070 1060 1060 1090 1100 1080
3 Comoros 1890 1920 2030 2070 2180 2160 2100 2140 2290 ... 2930 2950 3010 3000 2960 2990 3030 3070 3060 2970
4 Lesotho 693 808 997 1030 924 1050 1200 1400 1210 ... 2450 2590 2610 2640 2700 2780 2670 2620 2580 2410

5 rows × 51 columns

In [34]:
#Loading and delimiting the Life expectancy (years) data
life_expectancy_all_df = pd.read_csv('life_expectancy_years.csv')
life_expectancy_all_df = life_expectancy_all_df[select_cols]
life_expectancy_all_df = life_expectancy_all_df[life_expectancy_all_df.country.isin(sadc_countries)].reset_index(drop=True)
print(life_expectancy_all_df.shape)
life_expectancy_all_df.head()
(16, 51)
Out[34]:
country 1971 1972 1973 1974 1975 1976 1977 1978 1979 ... 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
0 Angola 46.8 47.0 47.2 47.4 47.5 47.5 47.7 47.8 48.0 ... 60.8 61.4 62.1 63.0 63.5 63.9 64.2 64.6 65.1 65.2
1 Botswana 57.0 57.5 58.1 58.6 59.1 59.6 59.9 60.3 60.6 ... 57.8 58.6 59.4 60.1 60.6 61.2 61.5 61.8 62.3 61.6
2 Congo, Dem. Rep. 49.9 50.0 50.3 50.7 50.9 51.1 51.3 51.5 51.8 ... 59.4 60.1 60.9 61.8 62.6 63.3 63.9 64.7 65.0 65.2
3 Comoros 51.7 52.0 52.3 52.5 52.9 53.2 53.5 53.8 54.1 ... 65.7 66.3 66.7 67.2 67.5 67.9 68.2 68.5 68.7 68.8
4 Lesotho 54.7 55.0 55.4 55.8 56.2 56.6 57.1 57.7 58.1 ... 48.2 47.9 47.9 47.9 48.5 49.6 50.8 51.4 51.8 52.0

5 rows × 51 columns

In [35]:
#Loading and delimiting the Life expectancy, male data
life_expectancy_male_df = pd.read_csv('life_expectancy_male.csv')
life_expectancy_male_df = life_expectancy_male_df[select_cols]
life_expectancy_male_df = life_expectancy_male_df[life_expectancy_male_df.country.isin(sadc_countries)].reset_index(drop=True)
print(life_expectancy_male_df.shape)
life_expectancy_male_df.head()
(16, 51)
Out[35]:
country 1971 1972 1973 1974 1975 1976 1977 1978 1979 ... 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
0 Angola 40.0 40.4 40.7 41.1 41.4 41.7 42.0 42.3 42.5 ... 53.8 54.7 55.4 56.1 56.7 57.2 57.7 58.1 58.4 58.7
1 Botswana 51.2 51.7 52.2 52.8 53.3 53.9 54.4 54.9 55.4 ... 59.4 60.9 62.3 63.5 64.5 65.2 65.8 66.2 66.5 66.7
2 Congo, Dem. Rep. 42.8 43.1 43.4 43.6 43.8 44.0 44.2 44.4 44.6 ... 56.0 56.5 56.9 57.4 57.8 58.2 58.5 58.9 59.1 59.4
3 Comoros 44.5 44.9 45.3 45.8 46.2 46.7 47.2 47.8 48.4 ... 60.6 61.0 61.3 61.6 61.8 62.0 62.2 62.4 62.6 62.8
4 Lesotho 46.1 46.7 47.2 47.8 48.4 49.1 49.8 50.5 51.1 ... 43.4 44.5 45.7 46.9 48.0 49.0 49.8 50.6 51.2 51.7

5 rows × 51 columns

In [36]:
#Loading and delimiting the Life expectancy, female data
life_expectancy_female_df = pd.read_csv('life_expectancy_female.csv')
life_expectancy_female_df = life_expectancy_female_df[select_cols]
life_expectancy_female_df = life_expectancy_female_df[life_expectancy_female_df.country.isin(sadc_countries)].reset_index(drop=True)
print(life_expectancy_female_df.shape)
life_expectancy_female_df.head()
(16, 51)
Out[36]:
country 1971 1972 1973 1974 1975 1976 1977 1978 1979 ... 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
0 Angola 42.7 43.1 43.4 43.8 44.2 44.5 44.9 45.2 45.5 ... 59.1 60.0 60.8 61.6 62.2 62.8 63.3 63.7 64.0 64.4
1 Botswana 57.5 58.0 58.6 59.1 59.7 60.3 60.8 61.4 61.9 ... 64.3 66.0 67.5 68.9 70.0 70.9 71.6 72.0 72.4 72.6
2 Congo, Dem. Rep. 45.6 45.9 46.2 46.4 46.6 46.8 47.0 47.2 47.5 ... 58.9 59.3 59.8 60.3 60.7 61.1 61.5 61.9 62.2 62.5
3 Comoros 47.8 48.3 48.7 49.2 49.6 50.1 50.6 51.1 51.7 ... 63.9 64.3 64.6 64.9 65.2 65.4 65.7 65.9 66.1 66.3
4 Lesotho 55.0 55.5 56.1 56.6 57.3 57.9 58.6 59.3 59.9 ... 49.3 50.5 51.8 53.1 54.3 55.3 56.2 57.0 57.6 58.1

5 rows × 51 columns

Exploratory Data Analysis

The exploratory data analysis for each dataframe is a repetitive process with the following steps:

  1. Checking for null values
  2. Checking the data types for each column in the dataframe
  3. A look at the descriptive statistics of the numerical columns of the dataframe
  4. Plotting a line graph of the combined averages (mean) of all the SADC countries over the time series
  5. Plotting a box plot for all the SADC countries over the time series. The box plot shows how skewed the data is for each particular year and exposes any outliers.
  6. Plotting histograms for all the SADC countries over the time series. Histograms reveal additional perspectives on the data as they show the skew of the data as well as the relative size of each grouping (i.e., number of countries in each grouping).
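
Since steps 1 to 3 are identical for every dataframe, they could be bundled into one function. The following is a minimal sketch under stated assumptions: `basic_checks` is a hypothetical helper name, and `demo` is a made-up frame that only mimics the wide layout (one row per country, one column per year) of the indicator dataframes.

```python
import pandas as pd


def basic_checks(df):
    """Return the number of columns with nulls, the dtypes and the descriptive statistics."""
    columns_with_nulls = int(df.isnull().any().sum())  # how many columns hold at least one null
    return columns_with_nulls, df.dtypes, df.describe()


# Made-up frame in the same wide layout as the indicator dataframes
demo = pd.DataFrame({'country': ['A', 'B', 'C'],
                     '1971': [50.0, 55.0, 60.0],
                     '1972': [51.0, 56.0, 61.0]})

null_columns, dtypes, stats = basic_checks(demo)

# Step 4 plots this series: the average across countries for each year.
# numeric_only=True skips the text column 'country'.
yearly_mean = demo.mean(axis=0, numeric_only=True)
print(null_columns)
print(yearly_mean)
```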

Income_df

Initial data cleaning: Because the income_df data uses 'k' to denote thousands, I need to do some initial data cleaning to replace the 'k' with numeric thousands (000) so that I can explore the income_df data as numeric values.
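
To illustrate how this conversion works, here is a small sketch on made-up values written in the same style as the income file (the frame `raw` is an assumption for demonstration, not the Gapminder data). Replacing 'k' with 'e+03' turns a value such as '16.3k' into the scientific-notation string '16.3e+03', which can then be parsed as a float.

```python
import pandas as pd

# Made-up values in the same style as the Gapminder income file
raw = pd.DataFrame({'2019': ['6670', '16.3k', '1100'],
                    '2020': ['6120', '14.6k', '1080']})

# Replace every 'k' with 'e+03' ('16.3k' -> '16.3e+03'),
# then let astype(float) parse each cell as a number
numeric = raw.replace({'k': 'e+03'}, regex=True).astype(float)
print(numeric)
```

Values without a 'k' are left untouched by the replacement and are simply converted to floats.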

In [37]:
#Creating a list of only the year columns
years = [str(year) for year in range(1971, 2021)]
In [38]:
#replacing 'k' with numeric 000's
replace_k = income_df[years].replace({'k': 'e+03'}, regex=True).astype(float)
In [39]:
#Updating the income_df dataframe with the numeric years columns
income_df.drop(years, axis=1, inplace=True)
income_df = pd.concat([income_df, replace_k], axis=1)
In [40]:
#Checking for null values in the income_df - there are none (0)
(income_df.isnull().sum() > 0).sum()
Out[40]:
0
In [41]:
#Confirming the data types of the income_df columns
income_df.dtypes
Out[41]:
country     object
1971       float64
1972       float64
1973       float64
1974       float64
1975       float64
1976       float64
1977       float64
1978       float64
1979       float64
1980       float64
1981       float64
1982       float64
1983       float64
1984       float64
1985       float64
1986       float64
1987       float64
1988       float64
1989       float64
1990       float64
1991       float64
1992       float64
1993       float64
1994       float64
1995       float64
1996       float64
1997       float64
1998       float64
1999       float64
2000       float64
2001       float64
2002       float64
2003       float64
2004       float64
2005       float64
2006       float64
2007       float64
2008       float64
2009       float64
2010       float64
2011       float64
2012       float64
2013       float64
2014       float64
2015       float64
2016       float64
2017       float64
2018       float64
2019       float64
2020       float64
dtype: object
In [42]:
#Plotting the average GDP per capita income for all SADC countries 
plt.figure(figsize=(16,4))
income_df.mean(axis=0, numeric_only=True).plot();
plt.title("SADC GDP per capita income over the years 1971 - 2020");
plt.ylabel("Income in $");
plt.xlabel("Years: 1971 - 2020");
In [43]:
#Boxplots of the GDP per capita income across the SADC countries for each year
#This view shows the skew of the data as well as any outliers
income_df.plot(figsize=(24,6), kind='box');
plt.title("SADC GDP per capita income over the years 1971 - 2020 - Boxplot");
In [44]:
#Histograms of the GDP per capita income across the SADC countries for each year
#This view shows the skew of the data as well as the relative size of each grouping (i.e., number of countries in each grouping)
fig, ax = plt.subplots(10, 5, figsize=(16, 30))
income_df.hist(ax=ax);
In [45]:
#A look at the descriptive statistics of the dataset
income_df.describe()
Out[45]:
1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 ... 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
count 16.000000 16.000000 16.000000 16.00000 16.000000 16.000000 16.00000 16.000000 16.000000 16.000000 ... 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.00000 16.000000 16.000000 16.00000
mean 3230.437500 3319.875000 3456.812500 3563.25000 3525.187500 3661.562500 3663.68750 3738.687500 3854.500000 3854.437500 ... 6973.437500 7166.687500 7425.750000 7578.750000 7621.875000 7731.250000 7833.12500 7908.750000 7910.625000 7208.75000
std 2809.153181 2772.289592 2837.811251 2929.69276 2949.927981 3077.446842 3081.62515 3172.865933 3374.146213 3369.555717 ... 6800.986869 6886.139574 7197.238243 7387.571432 7494.524418 7732.834646 7968.68347 8103.400829 8205.719525 7292.43615
min 507.000000 520.000000 532.000000 552.00000 549.000000 565.000000 549.00000 539.000000 542.000000 551.000000 ... 895.000000 927.000000 972.000000 1030.000000 1070.000000 1060.000000 1060.00000 1090.000000 1100.000000 1080.00000
25% 1577.500000 1755.000000 1840.000000 1872.50000 1822.500000 1675.000000 1655.00000 1687.500000 1685.000000 1682.500000 ... 1960.000000 1982.500000 2040.000000 2102.500000 2150.000000 2222.500000 2292.50000 2340.000000 2340.000000 2185.00000
50% 2395.000000 2445.000000 2405.000000 2645.00000 2515.000000 2460.000000 2360.00000 2250.000000 2305.000000 2360.000000 ... 3200.000000 3510.000000 3540.000000 3575.000000 3575.000000 3575.000000 3645.00000 3720.000000 3550.000000 3320.00000
75% 3590.000000 3747.500000 3917.500000 4020.00000 4055.000000 4592.500000 4595.00000 4562.500000 4530.000000 4385.000000 ... 10422.500000 10697.500000 11017.500000 11300.000000 11525.000000 11350.000000 11125.00000 11050.000000 10782.500000 9765.00000
max 11200.000000 11100.000000 11300.000000 11700.00000 11700.000000 11700.000000 11400.00000 11500.000000 11700.000000 12100.000000 ... 23200.000000 23300.000000 24200.000000 24900.000000 25600.000000 26400.000000 27300.00000 27500.000000 27600.000000 25300.00000

8 rows × 50 columns

life_expectancy_all_df

In [46]:
#Checking the number of null values
(life_expectancy_all_df.isnull().sum() > 0).sum()
Out[46]:
0
In [47]:
#Confirming the data types of each column in the dataframe
life_expectancy_all_df.dtypes
Out[47]:
country     object
1971       float64
1972       float64
1973       float64
1974       float64
1975       float64
1976       float64
1977       float64
1978       float64
1979       float64
1980       float64
1981       float64
1982       float64
1983       float64
1984       float64
1985       float64
1986       float64
1987       float64
1988       float64
1989       float64
1990       float64
1991       float64
1992       float64
1993       float64
1994       float64
1995       float64
1996       float64
1997       float64
1998       float64
1999       float64
2000       float64
2001       float64
2002       float64
2003       float64
2004       float64
2005       float64
2006       float64
2007       float64
2008       float64
2009       float64
2010       float64
2011       float64
2012       float64
2013       float64
2014       float64
2015       float64
2016       float64
2017       float64
2018       float64
2019       float64
2020       float64
dtype: object
In [48]:
#A look at the descriptive statistics of the dataset
life_expectancy_all_df.describe()
Out[48]:
1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 ... 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
count 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 ... 16.000000 16.000000 16.000000 16.00000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000
mean 53.731250 54.125000 54.543750 55.018750 55.350000 55.718750 56.000000 56.293750 56.631250 57.112500 ... 59.881250 60.600000 61.300000 61.87500 62.406250 62.918750 63.431250 63.975000 64.412500 64.343750
std 6.257019 6.188538 6.126116 6.177833 6.017641 6.064456 6.005442 6.101417 6.069675 6.166022 ... 7.185932 7.063427 6.954519 6.75707 6.518023 6.204001 5.943144 5.780715 5.697002 5.735267
min 42.700000 43.500000 44.200000 44.900000 45.500000 46.200000 46.800000 47.500000 48.000000 48.100000 ... 48.200000 47.900000 47.900000 47.90000 48.500000 49.600000 50.800000 51.400000 51.800000 52.000000
25% 49.700000 49.950000 50.600000 51.225000 51.725000 52.225000 52.950000 53.225000 53.525000 53.900000 ... 56.725000 57.875000 58.850000 59.57500 60.100000 60.700000 61.100000 61.500000 61.975000 61.375000
50% 52.950000 53.450000 53.900000 54.300000 54.800000 55.300000 55.650000 55.900000 56.200000 56.450000 ... 59.050000 60.100000 61.100000 61.80000 62.450000 63.000000 63.650000 64.500000 65.050000 65.000000
75% 57.275000 57.700000 58.175000 58.600000 58.875000 58.925000 58.850000 59.100000 59.150000 59.725000 ... 62.150000 62.975000 63.550000 63.97500 64.425000 64.750000 65.100000 65.500000 65.925000 66.025000
max 66.400000 67.200000 67.800000 68.300000 68.900000 69.400000 69.800000 70.300000 70.800000 71.100000 ... 74.600000 74.700000 75.100000 75.00000 75.200000 75.200000 75.300000 75.300000 75.500000 75.500000

8 rows × 50 columns

In [49]:
#Plotting the average combined life expectancy for all SADC countries 
plt.figure(figsize=(16,4))
life_expectancy_all_df.mean(axis=0, numeric_only=True).plot();
plt.title("SADC combined average life expectancy over the years 1971 - 2020");
plt.ylabel("life expectancy in years");
plt.xlabel("Years: 1971 - 2020");
In [50]:
#Boxplots of the combined life expectancy across the SADC countries for each year
#This view shows the skew of the data as well as any outliers
life_expectancy_all_df.plot(figsize=(24,6), kind='box');
In [51]:
#Histograms of the combined life expectancy across the SADC countries for each year
#This view shows the skew of the data as well as the relative size of each grouping (i.e., number of countries in each grouping)
fig, ax = plt.subplots(10, 5, figsize=(16, 30))
life_expectancy_all_df.hist(ax=ax);

life_expectancy_male_df

In [52]:
#Checking the number of null values
(life_expectancy_male_df.isnull().sum() > 0).sum()
Out[52]:
0
In [53]:
#Confirming the data types of each column in the dataframe
life_expectancy_male_df.dtypes
Out[53]:
country     object
1971       float64
1972       float64
1973       float64
1974       float64
1975       float64
1976       float64
1977       float64
1978       float64
1979       float64
1980       float64
1981       float64
1982       float64
1983       float64
1984       float64
1985       float64
1986       float64
1987       float64
1988       float64
1989       float64
1990       float64
1991       float64
1992       float64
1993       float64
1994       float64
1995       float64
1996       float64
1997       float64
1998       float64
1999       float64
2000       float64
2001       float64
2002       float64
2003       float64
2004       float64
2005       float64
2006       float64
2007       float64
2008       float64
2009       float64
2010       float64
2011       float64
2012       float64
2013       float64
2014       float64
2015       float64
2016       float64
2017       float64
2018       float64
2019       float64
2020       float64
dtype: object
In [54]:
#A look at the descriptive statistics of the dataset
life_expectancy_male_df.describe()
Out[54]:
1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 ... 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
count 16.000000 16.000000 16.00000 16.000000 16.000000 16.000000 16.00000 16.000000 16.000000 16.000000 ... 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.00000 16.000000 16.000000 16.000000
mean 48.275000 48.662500 49.07500 49.506250 49.893750 50.318750 50.73125 51.143750 51.525000 51.906250 ... 56.325000 57.268750 58.162500 58.968750 59.693750 60.287500 60.80000 61.237500 61.581250 61.887500
std 6.942766 6.869437 6.83691 6.800732 6.787977 6.813537 6.84687 6.890764 6.953704 7.048638 ... 7.241961 6.896929 6.545927 6.206311 5.910157 5.627536 5.42697 5.223776 5.102577 4.999983
min 39.900000 40.400000 40.70000 41.100000 41.400000 41.700000 42.00000 42.300000 42.400000 42.400000 ... 43.400000 44.500000 45.700000 46.900000 48.000000 49.000000 49.80000 50.600000 51.200000 51.700000
25% 44.000000 44.375000 44.82500 45.250000 45.600000 45.950000 46.30000 46.650000 47.000000 47.300000 ... 53.125000 54.325000 55.325000 56.475000 57.525000 57.950000 58.30000 58.700000 58.925000 59.225000
50% 46.500000 47.100000 47.75000 48.400000 49.050000 49.800000 50.35000 50.800000 51.200000 51.600000 ... 55.900000 56.700000 57.600000 58.450000 59.150000 59.650000 60.20000 60.500000 60.750000 61.050000
75% 50.975000 51.475000 51.97500 52.575000 53.150000 53.750000 54.32500 54.825000 55.325000 55.900000 ... 59.700000 60.925000 61.550000 62.075000 62.350000 62.600000 63.12500 63.675000 64.050000 64.425000
max 63.100000 63.500000 64.00000 64.400000 64.800000 65.300000 65.70000 66.100000 66.400000 66.700000 ... 70.300000 70.500000 70.700000 70.900000 71.100000 71.200000 71.40000 71.500000 71.700000 71.800000

8 rows × 50 columns

In [55]:
#Plotting the average male life expectancy for all SADC countries 
plt.figure(figsize=(16,4))
life_expectancy_male_df.mean(axis=0, numeric_only=True).plot();
plt.title("SADC average life expectancy for males over the years 1971 - 2020");
plt.ylabel("life expectancy in years");
plt.xlabel("Years: 1971 - 2020");
In [56]:
#Boxplots of the male life expectancy across the SADC countries for each year
#This view shows the skew of the data as well as any outliers
life_expectancy_male_df.plot(figsize=(24,6), kind='box');

life_expectancy_female_df

In [57]:
#Checking the number of null values
(life_expectancy_female_df.isnull().sum() > 0).sum()
Out[57]:
0
In [58]:
#Confirming the data types of each column in the dataframe
life_expectancy_female_df.dtypes
Out[58]:
country     object
1971       float64
1972       float64
1973       float64
1974       float64
1975       float64
1976       float64
1977       float64
1978       float64
1979       float64
1980       float64
1981       float64
1982       float64
1983       float64
1984       float64
1985       float64
1986       float64
1987       float64
1988       float64
1989       float64
1990       float64
1991       float64
1992       float64
1993       float64
1994       float64
1995       float64
1996       float64
1997       float64
1998       float64
1999       float64
2000       float64
2001       float64
2002       float64
2003       float64
2004       float64
2005       float64
2006       float64
2007       float64
2008       float64
2009       float64
2010       float64
2011       float64
2012       float64
2013       float64
2014       float64
2015       float64
2016       float64
2017       float64
2018       float64
2019       float64
2020       float64
dtype: object
In [59]:
#A look at the descriptive statistics of the dataset
life_expectancy_female_df.describe()
Out[59]:
1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 ... 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
count 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 ... 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000 16.000000
mean 52.243750 52.687500 53.131250 53.581250 54.050000 54.500000 54.943750 55.362500 55.781250 56.168750 ... 61.175000 62.262500 63.281250 64.218750 65.025000 65.706250 66.275000 66.731250 67.125000 67.456250
std 8.414271 8.408874 8.419083 8.459095 8.515946 8.577101 8.635583 8.696196 8.765023 8.802554 ... 7.653801 7.245493 6.839709 6.464902 6.128458 5.861566 5.643935 5.434423 5.334854 5.232331
min 40.500000 41.100000 41.600000 42.100000 42.600000 43.100000 43.600000 44.100000 44.500000 44.600000 ... 49.300000 50.500000 51.800000 53.100000 54.300000 55.300000 56.200000 57.000000 57.600000 58.100000
25% 46.050000 46.500000 46.950000 47.300000 47.725000 48.075000 48.425000 48.775000 49.150000 49.450000 ... 58.050000 58.625000 59.450000 60.200000 60.925000 61.575000 62.650000 63.525000 63.925000 64.350000
50% 51.350000 51.900000 52.450000 53.000000 53.500000 54.050000 54.550000 54.950000 55.400000 55.800000 ... 60.100000 61.650000 62.950000 64.100000 64.950000 65.400000 65.850000 66.300000 66.700000 67.050000
75% 56.150000 56.650000 57.250000 57.900000 58.475000 59.000000 59.450000 59.900000 60.325000 60.825000 ... 64.000000 64.675000 65.000000 65.725000 66.350000 66.900000 67.300000 67.625000 67.950000 68.200000
max 70.600000 71.000000 71.300000 71.700000 72.100000 72.400000 72.700000 73.000000 73.300000 73.500000 ... 77.300000 77.500000 77.800000 77.900000 78.100000 78.200000 78.300000 78.300000 78.500000 78.600000

8 rows × 50 columns

In [60]:
#Plotting the average female life expectancy for all SADC countries 
plt.figure(figsize=(16,4))
life_expectancy_female_df.mean(axis=0, numeric_only=True).plot();
plt.title("SADC average life expectancy for females over the years 1971 - 2020");
plt.ylabel("life expectancy in years");
plt.xlabel("Years: 1971 - 2020");
In [61]:
#Boxplots of the female life expectancy across the SADC countries for each year
#This view shows the skew of the data as well as any outliers
life_expectancy_female_df.plot(figsize=(24,6), kind='box');
In [62]:
#Histograms comparing the male and female life expectancies across the SADC countries for each year
#Pink = Female life expectancy histograms
#Light black/grey = Male life expectancy histograms
fig, ax = plt.subplots(10, 5, figsize=(16, 30))
life_expectancy_male_df.hist(ax=ax, alpha=0.3, color='black');
life_expectancy_female_df.hist(ax=ax, alpha=0.8, color='pink');

Observations from the EDA

The EDA steps above revealed a few interesting things and informed the questions that I will answer in the next section. A few of the key observations are listed below:

  • The data does not have any null values
  • The data is relatively clean. The only cleaning I had to do was on the income_df, where I replaced the 'k' for thousands with the numeric thousand (000's) to facilitate using the data in numeric operations.
  • The trend in the average GDP per capita income indicates a generally consistent rise between 1971 and 2020. However, there are two noticeable dips during the 2008/2009 and 2019/2020 periods.
  • The life expectancy trends for the combined, male and female data sets all indicate a general rise in 4 out of the 5 decades under review (i.e., 1971 - 1980, 1981 - 1990, 2001 - 2010, 2011 - 2020).
  • The decade between the years 1991 and 2000 (into the early years of the 2000s) is the exception, as it is characterised by falling life expectancy.
  • The life expectancy of females seems to be consistently higher than that of males throughout the review period.

Questions

My analysis will attempt to answer the following questions:

  1. Is there a relationship between the average GDP per capita income and the life expectancies (combined, male and female) for the SADC region?
  2. Is there a relationship between the GDP per capita income and the life expectancies (combined, male and female) for each of the 16 countries in the SADC region?
  3. Is the trend where female life expectancy is higher than the male life expectancy consistent for ALL 16 SADC countries for each of the years under review (1971 to 2020)?
  4. How do the figures for the GDP per capita income and the life expectancies for each of the countries compare to the average figures for the SADC region over the review period?

Each of these questions will be answered by means of visualisations in the sections that follow.

Question 1: Is there a relationship between the average GDP per capita income and the life expectancies (combined, male and female) for the SADC region?

For this question, all the line graphs from the EDA section are collated in one visualisation so that comparisons can be made. The visualisation makes use of two different y-axes, one with the life expectancy in years and the other with the per capita income in dollars.

The lines each depict the average (mean) values for each of the years for all the SADC countries combined.

In [63]:
#Plot the average (mean) values for each of the years for all the SADC countries combined for each indicator.
fig,ax = plt.subplots(figsize=(16,5))
ax2=ax.twinx();
income_df.mean(axis=0, numeric_only=True).plot(ax=ax2, color='green', linestyle='dotted', label='income', marker = '+')
life_expectancy_all_df.mean(axis=0, numeric_only=True).plot(ax=ax, color='brown', label='all_life_expectancy', marker = 'o')
life_expectancy_male_df.mean(axis=0, numeric_only=True).plot(ax=ax, color='black', label='male_life_expectancy', marker = '.', alpha=0.3)
life_expectancy_female_df.mean(axis=0, numeric_only=True).plot(ax=ax, color='pink', label='female_life_expectancy', marker = '.', alpha=0.8)
ax2.set_ylabel('per capita income in $', color='green')
ax.set_ylabel('life expectancy in years', color='brown')
ax.set_xlabel('Time period (years): from 1971 to 2020')
plt.title('SADC life expectancy and GDP per capita income over the years 1971 - 2020')
ax.legend(loc='center left');
ax2.legend(loc='center right');

The visual above generally supports the notion that the rise in average GDP per capita income for SADC between 1971 and 2020 was accompanied by a rise in life expectancy. This seems to be true for all the life expectancy indicators that we are tracking, i.e., combined, male and female life expectancy.

One exception is the period from roughly 1991 to 2000 (and into the early 2000s), when per capita income is rising but the life expectancy indicators are falling.

The other exception is around 2019/2020, when average per capita income dips sharply but the life expectancy indicators do not. The combined life expectancy figure does show signs of plateauing in 2019/2020, but only slightly.

On average, rising per capita income does seem to be associated with rising life expectancy in the SADC countries. The few exceptions may be explained by other events in the macro-environment, which are touched on in the conclusion section.
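
The visual comparison could be backed by a single number, for example the Pearson correlation between the two yearly average series. The sketch below is only an illustration: `toy_income` and `toy_life` are made-up placeholder frames (not values from the SADC data), laid out like the wide dataframes used in this report. A high correlation between two series that both trend upward over time does not by itself show that one drives the other.

```python
import pandas as pd

# Made-up wide-format frames (one row per country, one column per year).
# These numbers are placeholders, NOT values from the SADC dataset.
toy_income = pd.DataFrame(
    {"country": ["A", "B"], "1971": [1000, 3000], "1972": [1100, 3200], "1973": [1300, 3300]}
)
toy_life = pd.DataFrame(
    {"country": ["A", "B"], "1971": [50.0, 60.0], "1972": [50.5, 60.4], "1973": [51.5, 60.9]}
)

# Yearly averages across countries, skipping the non-numeric 'country' column
mean_income = toy_income.mean(axis=0, numeric_only=True)
mean_life = toy_life.mean(axis=0, numeric_only=True)

# Pearson correlation between the two yearly series (aligned on the year labels)
r = mean_income.corr(mean_life)
print(round(r, 3))
```

Applied to the real data, the same two lines of averaging and the `corr` call would work on `income_df` and `life_expectancy_all_df`, assuming both share the same year columns.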

Question 2: Is there a relationship between the GDP per capita income and the life expectancies (combined, male and female) for each of the 16 countries in the SADC region?

This section follows the same approach as question 1, but this time I am looking at the indicators country by country (as opposed to taking the average for the entire SADC region).

In [64]:
# Define a function that plots every indicator for a single country on one figure.
# It is called once per SADC country in the next cell.
def view_country(country):
    '''
    Takes a country name and shows the income and life expectancy graphs for the specified country
    country (string) - name of the country
    '''

    fig, ax = plt.subplots(figsize=(16, 5))
    ax2 = ax.twinx()  # second y-axis for income, sharing the same x-axis
    # numeric_only=True skips the non-numeric 'country' column of the selected row
    income_df[income_df.country == country].mean(axis=0, numeric_only=True).plot(ax=ax2, color='green', linestyle='dotted', label='income', marker='+')
    life_expectancy_all_df[life_expectancy_all_df.country == country].mean(axis=0, numeric_only=True).plot(ax=ax, color='brown', label='all_life_expectancy', marker='o')
    life_expectancy_male_df[life_expectancy_male_df.country == country].mean(axis=0, numeric_only=True).plot(ax=ax, color='black', label='male_life_expectancy', marker='.', alpha=0.3)
    life_expectancy_female_df[life_expectancy_female_df.country == country].mean(axis=0, numeric_only=True).plot(ax=ax, color='pink', label='female_life_expectancy', marker='.', alpha=0.8)
    ax2.set_ylabel('per capita income in $', color='green')
    ax.set_ylabel('life expectancy in years', color='brown')
    ax.set_xlabel('Time period (years): from 1971 to 2020')
    plt.title(country + ' life expectancy and GDP per capita income over the years 1971 - 2020')
    ax.legend(loc='upper left')
    ax2.legend(loc='center right');
In [65]:
#Call the function to print the visualisation per country
for country in sadc_countries:
    view_country(country)

Looking at the visualisations country by country brings interesting insights into how the individual countries contributed to the average trend for SADC in question 1. Although most countries roughly follow the average trend, some exhibit it more severely than others, whereas other countries seem to be less impacted. Some countries also move against the average trend in certain indicators or for certain periods during the years under review. A few of these observations are noted below.

  • The dip in life expectancy indicators observed between 1991 and the early 2000s is more pronounced in countries like Botswana, Zimbabwe, Eswatini, South Africa, and Lesotho, while countries like Seychelles, Comoros, Mauritius, and Mozambique barely show it.
  • The per capita income of countries like the Democratic Republic of the Congo and Madagascar shows a falling trend over the entire period. Zimbabwe had a sharp dip between 2000 and 2008 and then showed signs of recovery before plateauing in the early years of the decade starting 2011. South Africa had its dip from 1981 into the early 1990s, followed by sustained growth until a sharp decline in 2019/2020.
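
The size of the dip noted in the first observation can also be compared numerically by subtracting each country's 1991 value from its 2000 value. The sketch below is a minimal illustration for four countries only, using the combined life expectancy figures for 1991 and 2000 as they appear in the merged table under Question 3. The frame name `sample` and the string year labels are assumptions made for this example.

```python
import pandas as pd

# Combined life expectancy for 1991 and 2000, copied from the merged table
# in this report; only four countries are included for brevity.
sample = pd.DataFrame(
    {
        "country": ["Botswana", "Zimbabwe", "Mauritius", "Comoros"],
        "1991": [60.2, 61.4, 70.8, 58.7],
        "2000": [45.5, 47.5, 72.3, 61.4],
    }
).set_index("country")

# Change in years of life expectancy over the decade, largest fall first
change_1991_2000 = (sample["2000"] - sample["1991"]).round(1).sort_values()
print(change_1991_2000)
```

For these four countries, Botswana and Zimbabwe show falls of well over ten years, while Mauritius and Comoros show small gains, which is consistent with the pattern seen in the plots.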

Question 3: Is the trend where female life expectancy is higher than the male life expectancy consistent for ALL 16 SADC countries for each of the years under review (1971 to 2020)?

From the visualisations in the first two questions it seems the female life expectancy figures are generally higher than the male life expectancy figures. This question investigates whether this trend is indeed sustained for every country and every year under review.

The approach is to melt the dataframes so that the years become row values in a 'year' column, giving one row per country per year. The male life expectancy figure for each row is then subtracted from the corresponding female figure. Where the difference is positive (confirming that female life expectancy is higher), the difference cell shows a green bar; where it is negative, the bar is red.
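
As a small walk-through of this approach, the sketch below melts and merges two tiny frames and then checks the sign of every difference programmatically instead of by bar colour. The values for Angola and Malawi (1971 and 1972) are copied from the merged table in this report. The names `male_wide`, `female_wide` and `all_female_higher` are hypothetical and used only for this example.

```python
import pandas as pd

# Two countries and two years, values copied from the merged table in this report
male_wide = pd.DataFrame({"country": ["Angola", "Malawi"], "1971": [40.0, 39.9], "1972": [40.4, 40.4]})
female_wide = pd.DataFrame({"country": ["Angola", "Malawi"], "1971": [42.7, 40.5], "1972": [43.1, 41.1]})

# Wide -> long: one row per (country, year)
male_long = male_wide.melt(id_vars="country", var_name="year", value_name="male_life_expectancy")
female_long = female_wide.melt(id_vars="country", var_name="year", value_name="female_life_expectancy")

# Join on the (country, year) pair and compute female minus male
combined = male_long.merge(female_long, on=["country", "year"], how="inner").set_index(["country", "year"])
combined["diff"] = combined["female_life_expectancy"] - combined["male_life_expectancy"]

# True only if the female figure exceeds the male figure in every row
all_female_higher = bool((combined["diff"] > 0).all())
print(all_female_higher)
```

A check like `(diff > 0).all()` gives a yes/no answer for the whole table, which is easier to verify than scanning several hundred coloured bars.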

In [66]:
#Melt the dataframes to have the Year columns run as values in the 'year' column
income_df_melt = pd.melt(income_df, id_vars='country', var_name='year', value_name='income_per_capita')
life_expectancy_all_df_melt = pd.melt(life_expectancy_all_df, id_vars='country', var_name='year', value_name='all_life_expectancy')
life_expectancy_male_df_melt = pd.melt(life_expectancy_male_df, id_vars='country', var_name='year', value_name='male_life_expectancy')
life_expectancy_female_df_melt = pd.melt(life_expectancy_female_df, id_vars='country', var_name='year', value_name='female_life_expectancy')
In [67]:
#Merge all four dataframes with the unique combination of country and year as the index
df_merge_1 = income_df_melt.merge(life_expectancy_all_df_melt, how='inner',on=['country', 'year'])
df_merge_2 = df_merge_1.merge(life_expectancy_male_df_melt, how='inner',on=['country', 'year'])
df_merge_3 = df_merge_2.merge(life_expectancy_female_df_melt, how='inner',on=['country', 'year'])
df_combined = df_merge_3.reset_index(drop=True).set_index(['country', 'year'])
In [68]:
# Add a column with the difference between female and male life expectancy
# for each index entry (i.e. each unique pairing of country and year).
# A green bar marks rows where female life expectancy is higher; a red bar marks rows where male life expectancy is higher.
# SPOILER ALERT: All bars are green, which means females generally lived longer in the SADC countries for the period under review.

df_combined['diff_female_male_life_expectancy'] = (df_combined['female_life_expectancy'] - df_combined['male_life_expectancy'])
# Note: .style.bar returns a pandas Styler, so after this line df_combined is a styled view rather than a plain DataFrame
df_combined = df_combined.style.bar(subset=['diff_female_male_life_expectancy'], align='mid', color=['#d65f5f', '#5fba7d'])
In [69]:
df_combined
Out[69]:
country year income_per_capita all_life_expectancy male_life_expectancy female_life_expectancy diff_female_male_life_expectancy
Angola 1971 3200 46.8 40 42.7 2.7
Botswana 1971 1700 57 51.2 57.5 6.3
Congo, Dem. Rep. 1971 2760 49.9 42.8 45.6 2.8
Comoros 1971 1890 51.7 44.5 47.8 3.3
Lesotho 1971 693 54.7 46.1 55 8.9
Madagascar 1971 2530 49.1 44.4 46.2 1.8
Mozambique 1971 507 48 40.4 42.8 2.4
Mauritius 1971 4760 64 60.8 65.9 5.1
Malawi 1971 907 42.7 39.9 40.5 0.6
Namibia 1971 7330 58.1 50.9 55.6 4.7
Eswatini 1971 2100 52 46.9 50.7 3.8
Seychelles 1971 5530 66.4 63.1 70.6 7.5
Tanzania 1971 1210 50.1 46 48.4 2.4
South Africa 1971 11200 57 50.8 55.7 4.9
Zambia 1971 2260 53.9 49.1 52 2.9
Zimbabwe 1971 3110 58.3 55.5 58.9 3.4
Angola 1972 3170 47 40.4 43.1 2.7
Botswana 1972 2080 57.5 51.7 58 6.3
Congo, Dem. Rep. 1972 2700 50 43.1 45.9 2.8
Comoros 1972 1920 52 44.9 48.3 3.4
Lesotho 1972 808 55 46.7 55.5 8.8
Madagascar 1972 2460 49.8 44.8 46.7 1.9
Mozambique 1972 520 48.3 40.7 43 2.3
Mauritius 1972 5000 63.7 60.7 66.3 5.6
Malawi 1972 970 43.5 40.4 41.1 0.7
Namibia 1972 7220 58.3 51.3 56.2 4.9
Eswatini 1972 2310 52.6 47.5 51.4 3.9
Seychelles 1972 5840 67.2 63.5 71 7.5
Tanzania 1972 1260 50.8 46.4 48.8 2.4
South Africa 1972 11100 57.3 51.4 56.2 4.8
Zambia 1972 2430 54.3 49.5 52.4 2.9
Zimbabwe 1972 3330 58.7 55.6 59.1 3.5
Angola 1973 3320 47.2 40.7 43.4 2.7
Botswana 1973 2440 58.1 52.2 58.6 6.4
Congo, Dem. Rep. 1973 2850 50.3 43.4 46.2 2.8
Comoros 1973 2030 52.3 45.3 48.7 3.4
Lesotho 1973 997 55.4 47.2 56.1 8.9
Madagascar 1973 2360 50.7 45.3 47.2 1.9
Mozambique 1973 532 48.7 41 43.3 2.3
Mauritius 1973 5440 63.9 60.7 66.8 6.1
Malawi 1973 1020 44.2 40.9 41.6 0.7
Namibia 1973 7250 58.4 51.8 56.8 5
Eswatini 1973 2340 53.2 48.3 52.1 3.8
Seychelles 1973 6380 67.8 64 71.3 7.3
Tanzania 1973 1270 51.5 46.8 49.2 2.4
South Africa 1973 11300 57.6 51.9 56.8 4.9
Zambia 1973 2370 54.6 49.9 52.8 2.9
Zimbabwe 1973 3410 58.8 55.8 59.2 3.4
Angola 1974 3170 47.4 41.1 43.8 2.7
Botswana 1974 2570 58.6 52.8 59.1 6.3
Congo, Dem. Rep. 1974 2860 50.7 43.6 46.4 2.8
Comoros 1974 2070 52.5 45.8 49.2 3.4
Lesotho 1974 1030 55.8 47.8 56.6 8.8
Madagascar 1974 2370 51.4 45.8 47.6 1.8
Mozambique 1974 552 49.2 41.4 43.5 2.1
Mauritius 1974 5700 65.1 60.8 67.4 6.6
Malawi 1974 1060 44.9 41.3 42.1 0.8
Namibia 1974 7500 58.6 52.3 57.5 5.2
Eswatini 1974 2720 53.7 49 52.8 3.8
Seychelles 1974 6460 68.3 64.4 71.7 7.3
Tanzania 1974 1280 52.1 47.3 49.7 2.4
South Africa 1974 11700 58 52.5 57.3 4.8
Zambia 1974 2510 54.9 50.2 53.2 3
Zimbabwe 1974 3460 59.1 56 59.4 3.4
Angola 1975 1990 47.5 41.4 44.2 2.8
Botswana 1975 2680 59.1 53.3 59.7 6.4
Congo, Dem. Rep. 1975 2650 50.9 43.8 46.6 2.8
Comoros 1975 2180 52.9 46.2 49.6 3.4
Lesotho 1975 924 56.2 48.4 57.3 8.9
Madagascar 1975 2360 52 46.2 48.1 1.9
Mozambique 1975 549 49.8 41.7 43.8 2.1
Mauritius 1975 5600 64 60.9 68.1 7.2
Malawi 1975 1060 45.5 41.7 42.6 0.9
Namibia 1975 7390 58.8 52.7 58.1 5.4
Eswatini 1975 3540 54.5 49.7 53.5 3.8
Seychelles 1975 6620 68.9 64.8 72.1 7.3
Tanzania 1975 1320 52.8 47.7 50.1 2.4
South Africa 1975 11700 58.3 53.1 57.9 4.8
Zambia 1975 2380 55.1 50.5 53.5 3
Zimbabwe 1975 3460 59.3 56.2 59.6 3.4
Angola 1976 1770 47.5 41.7 44.5 2.8
Botswana 1976 2860 59.6 53.9 60.3 6.4
Congo, Dem. Rep. 1976 2440 51.1 44 46.8 2.8
Comoros 1976 2160 53.2 46.7 50.1 3.4
Lesotho 1976 1050 56.6 49.1 57.9 8.8
Madagascar 1976 2250 52.6 46.6 48.5 1.9
Mozambique 1976 565 49.9 42 44 2
Mauritius 1976 6640 64.9 61.1 68.8 7.7
Malawi 1976 1070 46.2 42.1 43.1 1
Namibia 1976 7230 58.9 53.2 58.7 5.5
Eswatini 1976 3910 55.2 50.5 54.3 3.8
Seychelles 1976 7670 69.4 65.3 72.4 7.1
Tanzania 1976 1390 53.4 48 50.5 2.5
South Africa 1976 11700 58.6 53.7 58.4 4.7
Zambia 1976 2480 55.4 50.7 53.8 3.1
Zimbabwe 1976 3400 59 56.5 59.9 3.4
Angola 1977 1740 47.7 42 44.9 2.9
Botswana 1977 3040 59.9 54.4 60.8 6.4
Congo, Dem. Rep. 1977 2400 51.3 44.2 47 2.8
Comoros 1977 2100 53.5 47.2 50.6 3.4
Lesotho 1977 1200 57.1 49.8 58.6 8.8
Madagascar 1977 2260 53.5 47 48.9 1.9
Mozambique 1977 549 49.9 42.2 44.2 2
Mauritius 1977 6830 65 61.5 69.5 8
Malawi 1977 1090 46.8 42.5 43.6 1.1
Namibia 1977 7040 59 53.5 59.2 5.7
Eswatini 1977 3850 55.8 51.3 55.1 3.8
Seychelles 1977 8290 69.8 65.7 72.7 7
Tanzania 1977 1400 53.9 48.4 50.8 2.4
South Africa 1977 11400 58.8 54.3 59 4.7
Zambia 1977 2320 55.5 50.9 54 3.1
Zimbabwe 1977 3110 58.5 56.8 60.2 3.4
Angola 1978 1780 47.8 42.3 45.2 2.9
Botswana 1978 3260 60.3 54.9 61.4 6.5
Congo, Dem. Rep. 1978 2210 51.5 44.4 47.2 2.8
Comoros 1978 2140 53.8 47.8 51.1 3.3
Lesotho 1978 1400 57.7 50.5 59.3 8.8
Madagascar 1978 2160 54.1 47.4 49.3 1.9
Mozambique 1978 539 50.1 42.4 44.4 2
Mauritius 1978 6850 66.4 61.9 70.1 8.2
Malawi 1978 1150 47.5 42.9 44.1 1.2
Namibia 1978 7100 59.1 53.9 59.7 5.8
Eswatini 1978 3800 56.3 52.1 55.8 3.7
Seychelles 1978 9040 70.3 66.1 73 6.9
Tanzania 1978 1410 54.4 48.6 51.1 2.5
South Africa 1978 11500 59.1 54.8 59.5 4.7
Zambia 1978 2290 55.5 51.1 54.1 3
Zimbabwe 1978 3190 56.8 57.2 60.5 3.3
Angola 1979 1780 48 42.5 45.5 3
Botswana 1979 3580 60.6 55.4 61.9 6.5
Congo, Dem. Rep. 1979 2160 51.8 44.6 47.5 2.9
Comoros 1979 2290 54.1 48.4 51.7 3.3
Lesotho 1979 1210 58.1 51.1 59.9 8.8
Madagascar 1979 2320 54.6 47.8 49.7 1.9
Mozambique 1979 542 50.2 42.4 44.5 2.1
Mauritius 1979 6840 66.5 62.4 70.7 8.3
Malawi 1979 1160 48.3 43.3 44.5 1.2
Namibia 1979 7160 59.1 54.1 60.1 6
Eswatini 1979 3760 56.8 52.9 56.6 3.7
Seychelles 1979 10400 70.8 66.4 73.3 6.9
Tanzania 1979 1400 55 48.9 51.4 2.5
South Africa 1979 11700 59.3 55.3 60 4.7
Zambia 1979 2190 55.6 51.3 54.2 2.9
Zimbabwe 1979 3180 57.3 57.6 61 3.4
Angola 1980 1780 48.1 42.7 45.8 3.1
Botswana 1980 3870 61 55.9 62.4 6.5
Congo, Dem. Rep. 1980 2150 52.1 44.9 47.8 2.9
Comoros 1980 2420 54.5 49 52.3 3.3
Lesotho 1980 1150 58.5 51.8 60.6 8.8
Madagascar 1980 2300 54.9 48.1 50 1.9
Mozambique 1980 551 50.5 42.4 44.6 2.2
Mauritius 1980 5930 67.2 63 71.1 8.1
Malawi 1980 1130 48.8 43.6 45 1.4
Namibia 1980 7420 58.9 54.4 60.4 6
Eswatini 1980 3720 57.3 53.7 57.4 3.7
Seychelles 1980 10100 71.1 66.7 73.5 6.8
Tanzania 1980 1390 55.4 49 51.6 2.6
South Africa 1980 12100 59.5 55.9 60.5 4.6
Zambia 1980 2210 55.6 51.4 54.2 2.8
Zimbabwe 1980 3450 60.4 58 61.5 3.5
Angola 1981 1710 48.2 42.9 46 3.1
Botswana 1981 4170 61.3 56.3 62.9 6.6
Congo, Dem. Rep. 1981 2150 52.4 45.2 48 2.8
Comoros 1981 2500 54.9 49.7 52.9 3.2
Lesotho 1981 1130 58.8 52.4 61.2 8.8
Madagascar 1981 2050 55.1 48.3 50.3 2
Mozambique 1981 545 48 42.4 44.6 2.2
Mauritius 1981 6070 67.6 63.6 71.4 7.8
Malawi 1981 1040 49.2 43.8 45.4 1.6
Namibia 1981 7400 59.1 54.6 60.8 6.2
Eswatini 1981 3870 57.7 54.6 58.3 3.7
Seychelles 1981 9440 71.1 67 73.7 6.7
Tanzania 1981 1350 55.7 49.1 51.7 2.6
South Africa 1981 12500 59.8 56.4 61 4.6
Zambia 1981 2300 55.7 51.5 54.2 2.7
Zimbabwe 1981 3810 61 58.4 62 3.6
Angola 1982 1600 48.2 43.1 46.2 3.1
Botswana 1982 4370 61.6 56.7 63.4 6.7
Congo, Dem. Rep. 1982 2090 52.7 45.5 48.3 2.8
Comoros 1982 2660 55.3 50.3 53.6 3.3
Lesotho 1982 1140 59.2 53 61.8 8.8
Madagascar 1982 1970 54.5 48.5 50.5 2
Mozambique 1982 512 48.2 42.3 44.7 2.4
Mauritius 1982 6200 68.5 64 71.7 7.7
Malawi 1982 1010 49.5 44.1 45.8 1.7
Namibia 1982 7160 59.4 54.9 61.2 6.3
Eswatini 1982 3830 58.2 55.4 59.1 3.7
Seychelles 1982 9330 70.7 67.2 73.9 6.7
Tanzania 1982 1340 56 49.2 51.9 2.7
South Africa 1982 12100 60.2 57 61.5 4.5
Zambia 1982 2180 55.6 51.6 54.2 2.6
Zimbabwe 1982 3860 61.5 58.8 62.5 3.7
Angola 1983 1500 48.2 43.3 46.4 3.1
Botswana 1983 4730 62 57.1 63.8 6.7
Congo, Dem. Rep. 1983 2070 53 45.8 48.6 2.8
Comoros 1983 2780 55.6 51 54.2 3.2
Lesotho 1983 1020 59.4 53.5 62.3 8.8
Madagascar 1983 1950 54.7 48.6 50.6 2
Mozambique 1983 436 48.3 42.3 44.8 2.5
Mauritius 1983 6050 68.8 64.4 71.9 7.5
Malawi 1983 1020 49.8 44.2 46.2 2
Namibia 1983 6780 59.7 55.3 61.6 6.3
Eswatini 1983 3690 57.6 56.3 60 3.7
Seychelles 1983 9210 71.4 67.3 74.1 6.8
Tanzania 1983 1310 56.3 49.2 52 2.8
South Africa 1983 11600 60.5 57.5 62.1 4.6
Zambia 1983 2080 55.6 51.7 54.2 2.5
Zimbabwe 1983 3830 62 59.1 63 3.9
Angola 1984 1460 48.4 43.4 46.6 3.2
Botswana 1984 5190 62.2 57.4 64.3 6.9
Congo, Dem. Rep. 1984 2140 53.4 46 48.9 2.9
Comoros 1984 2890 56.2 51.6 54.8 3.2
Lesotho 1984 1020 59.7 54 62.8 8.8
Madagascar 1984 1840 54.7 48.6 50.7 2.1
Mozambique 1984 443 45.8 42.3 45 2.7
Mauritius 1984 6150 69.1 64.6 72.1 7.5
Malawi 1984 1030 50 44.3 46.6 2.3
Namibia 1984 6550 60 55.8 62 6.2
Eswatini 1984 3670 59 57.1 60.8 3.7
Seychelles 1984 9970 71.5 67.4 74.3 6.9
Tanzania 1984 1310 56.5 49.2 52.2 3
South Africa 1984 11900 61.3 58.1 62.7 4.6
Zambia 1984 2030 55.4 51.7 54 2.3
Zimbabwe 1984 3670 62.5 59.2 63.4 4.2
Angola 1985 1440 48.6 43.5 46.8 3.3
Botswana 1985 5460 62.5 57.6 64.6 7
Congo, Dem. Rep. 1985 2100 53.7 46.3 49.1 2.8
Comoros 1985 2940 56.5 52.3 55.4 3.1
Lesotho 1985 1090 60 54.4 63.3 8.9
Madagascar 1985 1840 54.8 48.7 50.8 2.1
Mozambique 1985 400 45.8 42.3 45.2 2.9
Mauritius 1985 6390 68.8 64.7 72.2 7.5
Malawi 1985 1080 50.2 44.3 46.9 2.6
Namibia 1985 6390 60.2 56.4 62.5 6.1
Eswatini 1985 3620 59.6 57.9 61.5 3.6
Seychelles 1985 11000 71.3 67.4 74.5 7.1
Tanzania 1985 1280 56.6 49.2 52.3 3.1
South Africa 1985 11500 61.9 58.7 63.3 4.6
Zambia 1985 2020 55.1 51.6 53.9 2.3
Zimbabwe 1985 3830 63 59.1 63.7 4.6
Angola 1986 1290 48.6 43.6 46.9 3.3
Botswana 1986 5640 62.6 57.6 64.7 7.1
Congo, Dem. Rep. 1986 2140 53.9 46.5 49.4 2.9
Comoros 1986 2990 57 52.9 56 3.1
Lesotho 1986 1090 60.3 54.7 63.6 8.9
Madagascar 1986 1810 54.4 48.8 50.9 2.1
Mozambique 1986 405 47.6 42.5 45.5 3
Mauritius 1986 6820 69.2 64.8 72.4 7.6
Malawi 1986 1040 50.1 44.3 47.2 2.9
Namibia 1986 6550 60.7 56.9 63 6.1
Eswatini 1986 4150 60 58.6 62.2 3.6
Seychelles 1986 11200 71.1 67.3 74.6 7.3
Tanzania 1986 1300 56.6 49.1 52.4 3.3
South Africa 1986 11300 62.1 59.2 64 4.8
Zambia 1986 1980 54.8 51.3 53.5 2.2
Zimbabwe 1986 3850 63.4 58.8 63.7 4.9
Angola 1987 1410 48.6 43.7 47 3.3
Botswana 1987 5870 62.7 57.3 64.7 7.4
Congo, Dem. Rep. 1987 2150 54 46.8 49.6 2.8
Comoros 1987 3030 57.4 53.5 56.6 3.1
Lesotho 1987 1120 60.4 54.9 63.9 9
Madagascar 1987 1810 54.4 48.9 51.1 2.2
Mozambique 1987 425 47.7 42.7 45.8 3.1
Mauritius 1987 7300 69.5 64.9 72.6 7.7
Malawi 1987 1010 50.1 44.3 47.5 3.2
Namibia 1987 6630 61.1 57.5 63.5 6
Eswatini 1987 4420 60.5 59.2 62.8 3.6
Seychelles 1987 11700 71 67.2 74.8 7.6
Tanzania 1987 1330 56.4 49 52.4 3.4
South Africa 1987 11300 62.8 59.7 64.7 5
Zambia 1987 1990 54.2 50.8 53 2.2
Zimbabwe 1987 3710 63.6 58.2 63.5 5.3
Angola 1988 1580 48.6 43.7 47 3.3
Botswana 1988 6340 62.2 56.8 64.4 7.6
Congo, Dem. Rep. 1988 2100 54.2 47.1 49.9 2.8
Comoros 1988 3100 57.8 54 57.2 3.2
Lesotho 1988 1210 60.6 55 64 9
Madagascar 1988 1800 54.9 49.1 51.3 2.2
Mozambique 1988 446 51 42.9 46.1 3.2
Mauritius 1988 7570 69.7 65.1 72.8 7.7
Malawi 1988 968 50 44.2 47.8 3.6
Namibia 1988 6390 61.1 58 64 6
Eswatini 1988 4550 61 59.7 63.3 3.6
Seychelles 1988 12300 70.9 67 75 8
Tanzania 1988 1360 56.2 48.8 52.4 3.6
South Africa 1988 11500 63.5 60.1 65.3 5.2
Zambia 1988 2070 53.9 50.1 52.3 2.2
Zimbabwe 1988 3970 63.6 57.4 63.1 5.7
Angola 1989 1580 49.4 43.7 47.1 3.4
Botswana 1989 8140 62.1 56.1 63.9 7.8
Congo, Dem. Rep. 1989 2010 54.3 47.3 50.2 2.9
Comoros 1989 3050 58 54.6 57.7 3.1
Lesotho 1989 1280 60.7 54.9 64.1 9.2
Madagascar 1989 1820 55.4 49.4 51.7 2.3
Mozambique 1989 473 51.4 43.3 46.5 3.2
Mauritius 1989 7680 69.9 65.4 73 7.6
Malawi 1989 958 49.7 44.1 48 3.9
Namibia 1989 6180 62.3 58.4 64.3 5.9
Eswatini 1989 4800 61.4 60 63.6 3.6
Seychelles 1989 13600 70.5 66.8 75.2 8.4
Tanzania 1989 1370 55.8 48.6 52.3 3.7
South Africa 1989 11500 63.8 60.4 65.9 5.5
Zambia 1989 2050 53 49.1 51.5 2.4
Zimbabwe 1989 4150 63.2 56.3 62.4 6.1
Angola 1990 1610 49.7 43.6 47.1 3.5
Botswana 1990 8430 61.3 55.3 63.1 7.8
Congo, Dem. Rep. 1990 1820 54.3 47.6 50.5 2.9
Comoros 1990 3060 58.5 55.1 58.3 3.2
Lesotho 1990 1340 60.6 54.8 63.9 9.1
Madagascar 1990 1850 55.9 49.8 52.2 2.4
Mozambique 1990 471 51.6 43.6 46.8 3.2
Mauritius 1990 7990 70.2 65.7 73.3 7.6
Malawi 1990 953 49.4 44 48.2 4.2
Namibia 1990 5950 62.4 58.6 64.5 5.9
Eswatini 1990 5100 61.8 60 63.8 3.8
Seychelles 1990 14600 70.6 66.6 75.4 8.8
Tanzania 1990 1390 55.3 48.4 52.1 3.7
South Africa 1990 11300 64.2 60.5 66.3 5.8
Zambia 1990 2190 51.9 48 50.5 2.5
Zimbabwe 1990 4170 62.4 55.1 61.4 6.3
Angola 1991 1670 50.3 43.6 47.1 3.5
Botswana 1991 8790 60.2 54.3 62.2 7.9
Congo, Dem. Rep. 1991 1610 54.2 47.8 50.6 2.8
Comoros 1991 2810 58.7 55.6 58.8 3.2
Lesotho 1991 1400 60.5 54.4 63.6 9.2
Madagascar 1991 1690 56.3 50.4 52.9 2.5
Mozambique 1991 482 51.8 44 47.2 3.2
Mauritius 1991 8250 70.8 66 73.6 7.6
Malawi 1991 1020 49 44 48.4 4.4
Namibia 1991 6240 62.4 58.5 64.4 5.9
Eswatini 1991 5050 61.9 59.6 63.6 4
Seychelles 1991 14800 70.8 66.4 75.6 9.2
Tanzania 1991 1380 54.8 48.1 51.9 3.8
South Africa 1991 10900 64.3 60.4 66.6 6.2
Zambia 1991 2130 50.7 46.8 49.5 2.7
Zimbabwe 1991 4300 61.4 53.7 60.1 6.4
Angola 1992 1620 50.3 43.5 47.1 3.6
Botswana 1992 8800 58.6 53.4 61.2 7.8
Congo, Dem. Rep. 1992 1380 54.1 47.9 50.7 2.8
Comoros 1992 2960 59 56.1 59.2 3.1
Lesotho 1992 1460 60 53.9 63 9.1
Madagascar 1992 1660 56.6 51 53.6 2.6
Mozambique 1992 437 52.1 44.3 47.6 3.3
Mauritius 1992 8680 70.8 66.3 73.8 7.5
Malawi 1992 932 48.5 43.8 48.5 4.7
Namibia 1992 6510 62.3 58.2 64 5.8
Eswatini 1992 5080 61.9 58.8 63.2 4.4
Seychelles 1992 15800 70.8 66.2 75.8 9.6
Tanzania 1992 1340 54.1 47.8 51.7 3.9
South Africa 1992 10400 63.6 60.1 66.6 6.5
Zambia 1992 2050 49.6 45.5 48.5 3
Zimbabwe 1992 3830 59.8 52.2 58.6 6.4
Angola 1993 1260 49 43.4 47.1 3.7
Botswana 1993 8740 56.7 52.5 60.1 7.6
Congo, Dem. Rep. 1993 1150 53.9 47.9 50.7 2.8
Comoros 1993 2970 59.4 56.5 59.6 3.1
Lesotho 1993 1480 59.4 53.2 62.3 9.1
Madagascar 1993 1640 56.6 51.8 54.3 2.5
Mozambique 1993 464 52.4 44.6 48.1 3.5
Mauritius 1993 9010 70.7 66.5 74 7.5
Malawi 1993 1020 47.8 43.7 48.6 4.9
Namibia 1993 6250 62 57.6 63.4 5.8
Eswatini 1993 5120 61.8 57.7 62.6 4.9
Seychelles 1993 16500 70.7 66 75.9 9.9
Tanzania 1993 1310 53.7 47.6 51.5 3.9
South Africa 1993 10300 64.3 59.6 66.4 6.8
Zambia 1993 2130 48.7 44.3 47.6 3.3
Zimbabwe 1993 3800 58 50.7 57 6.3
Angola 1994 1300 50.3 43.4 47.1 3.7
Botswana 1994 8840 54.6 51.7 59 7.3
Congo, Dem. Rep. 1994 1070 53.7 47.8 50.6 2.8
Comoros 1994 2730 59.6 56.9 60 3.1
Lesotho 1994 1540 58.3 52.4 61.3 8.9
Madagascar 1994 1590 57.3 52.6 55.1 2.5
Mozambique 1994 475 52.6 44.9 48.5 3.6
Mauritius 1994 9250 70.8 66.6 74.1 7.5
Malawi 1994 912 46.9 43.6 48.6 5
Namibia 1994 6210 61 56.7 62.4 5.7
Eswatini 1994 5120 60.9 56.2 61.6 5.4
Seychelles 1994 15900 70.6 65.9 76.1 10.2
Tanzania 1994 1290 53.2 47.4 51.4 4
South Africa 1994 10300 63.4 59 66 7
Zambia 1994 1900 47.7 43.3 46.8 3.5
Zimbabwe 1994 4090 56.2 49.1 55.3 6.2
Angola 1995 1530 51.2 43.4 47.2 3.8
Botswana 1995 9230 52.5 51 58 7
Congo, Dem. Rep. 1995 1040 53.6 47.7 50.5 2.8
Comoros 1995 2750 59.8 57.2 60.3 3.1
Lesotho 1995 1560 57.1 51.3 60 8.7
Madagascar 1995 1570 57.9 53.4 55.9 2.5
Mozambique 1995 469 52.6 45.2 48.9 3.7
Mauritius 1995 9570 71.1 66.7 74.1 7.4
Malawi 1995 1050 46.3 43.4 48.5 5.1
Namibia 1995 6310 60.3 55.6 61.2 5.6
Eswatini 1995 5260 59.3 54.4 60.4 6
Seychelles 1995 15500 71 65.9 76.1 10.2
Tanzania 1995 1290 52.9 47.4 51.3 3.9
South Africa 1995 10400 62.9 58.2 65.2 7
Zambia 1995 1910 46.8 42.4 46.2 3.8
Zimbabwe 1995 4040 54.2 47.7 53.6 5.9
Angola 1996 1780 51.7 43.4 47.4 4
Botswana 1996 9540 50.6 50.4 56.9 6.5
Congo, Dem. Rep. 1996 1000 52 47.6 50.5 2.9
Comoros 1996 2640 59.2 57.4 60.5 3.1
Lesotho 1996 1610 55.8 50.2 58.5 8.3
Madagascar 1996 1550 58.2 54.2 56.7 2.5
Mozambique 1996 506 52.6 45.5 49.4 3.9
Mauritius 1996 10000 71.2 66.8 74.2 7.4
Malawi 1996 1110 46 43.3 48.4 5.1
Namibia 1996 6380 59.2 54.4 59.9 5.5
Eswatini 1996 5350 57.2 52.4 58.9 6.5
Seychelles 1996 16100 71.1 65.9 76.2 10.3
Tanzania 1996 1320 52.7 47.4 51.3 3.9
South Africa 1996 10700 61.8 57.2 64.3 7.1
Zambia 1996 1980 46.1 41.8 45.8 4
Zimbabwe 1996 4410 52.5 46.4 52 5.6
Angola 1997 1960 51.6 43.6 47.6 4
Botswana 1997 10100 49 49.8 55.9 6.1
Congo, Dem. Rep. 1997 921 53.1 47.6 50.5 2.9
Comoros 1997 2680 60.1 57.6 60.7 3.1
Lesotho 1997 1640 54.1 48.9 56.7 7.8
Madagascar 1997 1560 58.4 55 57.5 2.5
Mozambique 1997 548 52.6 45.7 49.8 4.1
Mauritius 1997 10400 71.1 66.9 74.3 7.4
Malawi 1997 1130 45.7 43.1 48.2 5.1
Namibia 1997 6510 58 53.1 58.4 5.3
Eswatini 1997 5420 55.4 50.3 57 6.7
Seychelles 1997 17800 71.2 66 76.2 10.2
Tanzania 1997 1330 52.6 47.6 51.5 3.9
South Africa 1997 10800 59.5 56.2 63.1 6.9
Zambia 1997 2000 45.7 41.5 45.6 4.1
Zimbabwe 1997 4490 51.2 45.2 50.5 5.3
Angola 1998 2110 50.6 43.8 47.9 4.1
Botswana 1998 9910 47.6 49.2 54.9 5.7
Congo, Dem. Rep. 1998 886 53.3 47.8 50.7 2.9
Comoros 1998 2640 60.3 57.7 60.9 3.2
Lesotho 1998 1640 52 47.6 54.8 7.2
Madagascar 1998 1570 58.6 55.8 58.3 2.5
Mozambique 1998 588 52.6 46.1 50.2 4.1
Mauritius 1998 11000 71.8 67.1 74.4 7.3
Malawi 1998 1140 45.5 43 47.9 4.9
Namibia 1998 6590 56.5 51.9 56.9 5
Eswatini 1998 5460 53 48.2 54.9 6.7
Seychelles 1998 18900 71.6 66.1 76.2 10.1
Tanzania 1998 1350 52.6 47.9 51.8 3.9
South Africa 1998 10600 58 55.1 61.8 6.7
Zambia 1998 1930 45.5 41.4 45.6 4.2
Zimbabwe 1998 4580 49.7 44.1 49.1 5
Angola 1999 2210 51.9 44.1 48.3 4.2
Botswana 1999 10600 46.4 48.7 54 5.3
Congo, Dem. Rep. 1999 828 53.5 48.1 51 2.9
Comoros 1999 2620 61.2 57.8 61 3.2
Lesotho 1999 1630 50.2 46.2 52.8 6.6
Madagascar 1999 1590 59.1 56.6 59 2.4
Mozambique 1999 640 52.4 46.4 50.6 4.2
Mauritius 1999 11100 71.9 67.4 74.6 7.2
Malawi 1999 1140 45.5 42.8 47.7 4.9
Namibia 1999 6690 55.2 50.8 55.6 4.8
Eswatini 1999 5550 50.7 46.2 52.8 6.6
Seychelles 1999 18900 71.8 66.3 76.2 9.9
Tanzania 1999 1380 53 48.4 52.2 3.8
South Africa 1999 10700 57.2 54.1 60.5 6.4
Zambia 1999 1970 45.3 41.5 45.8 4.3
Zimbabwe 1999 4510 48.5 43.3 47.9 4.6
Angola 2000 2340 52.8 44.5 48.7 4.2
Botswana 2000 10600 45.5 48.2 53.3 5.1
Congo, Dem. Rep. 2000 752 53.9 48.6 51.5 2.9
Comoros 2000 2830 61.4 57.9 61.1 3.2
Lesotho 2000 1690 48.5 44.9 50.9 6
Madagascar 2000 1610 59.5 57.3 59.7 2.4
Mozambique 2000 631 52.2 46.8 51 4.2
Mauritius 2000 11900 72.3 67.8 74.8 7
Malawi 2000 1130 45.9 42.8 47.6 4.8
Namibia 2000 6810 53.9 49.9 54.4 4.5
Eswatini 2000 5580 48.3 44.5 50.7 6.2
Seychelles 2000 19000 72.2 66.6 76.2 9.6
Tanzania 2000 1410 53.4 48.9 52.7 3.8
South Africa 2000 11000 55.6 53.1 59.2 6.1
Zambia 2000 1990 45.3 41.9 46.2 4.3
Zimbabwe 2000 4350 47.5 42.6 46.9 4.3
Angola 2001 2510 53.4 45 49.3 4.3
Botswana 2001 10500 45 48 52.8 4.8
Congo, Dem. Rep. 2001 716 54.2 49.2 52.1 2.9
Comoros 2001 2830 62.2 58 61.1 3.1
Lesotho 2001 1740 47.3 43.7 49.1 5.4
Madagascar 2001 1660 60 57.9 60.3 2.4
Mozambique 2001 687 52.3 47.1 51.4 4.3
Mauritius 2001 12200 72.6 68.2 75 6.8
Malawi 2001 1050 46.3 43 47.7 4.7
Namibia 2001 6780 53.1 49.3 53.4 4.1
Eswatini 2001 5590 46.3 43.1 48.7 5.6
Seychelles 2001 18500 72.3 66.9 76.2 9.3
Tanzania 2001 1460 54 49.6 53.3 3.7
South Africa 2001 11200 54.8 52.3 58 5.7
Zambia 2001 2040 45.2 42.5 46.8 4.3
Zimbabwe 2001 4400 47 42.1 46.1 4
Angola 2002 2930 54.5 45.6 50 4.4
Botswana 2002 10900 44.9 48 52.7 4.7
Congo, Dem. Rep. 2002 715 54.4 49.9 52.8 2.9
Comoros 2002 2830 62.5 58 61.2 3.2
Lesotho 2002 1760 46.1 42.6 47.6 5
Madagascar 2002 1410 60 58.5 61 2.5
Mozambique 2002 729 52.2 47.5 51.7 4.2
Mauritius 2002 12300 72.7 68.5 75.2 6.7
Malawi 2002 1040 46.7 43.3 48 4.7
Namibia 2002 6990 52.5 48.9 52.7 3.8
Eswatini 2002 5810 45 42 47.1 5.1
Seychelles 2002 18200 72.1 67.2 76.2 9
Tanzania 2002 1520 54.6 50.4 53.9 3.5
South Africa 2002 11400 53.7 51.7 57.1 5.4
Zambia 2002 2080 45.9 43.3 47.6 4.3
Zimbabwe 2002 4000 46.6 41.7 45.5 3.8
Angola 2003 3150 55.1 46.3 50.8 4.5
Botswana 2003 11200 45.7 48.3 52.9 4.6
Congo, Dem. Rep. 2003 733 54.8 50.7 53.6 2.9
Comoros 2003 2820 63.3 58.2 61.4 3.2
Lesotho 2003 1850 45 41.7 46.3 4.6
Madagascar 2003 1500 60.3 59 61.5 2.5
Mozambique 2003 757 52.1 47.7 52 4.3
Mauritius 2003 12900 72.9 68.8 75.5 6.7
Malawi 2003 1070 47.4 43.8 48.5 4.7
Namibia 2003 7180 52.1 48.8 52.3 3.5
Eswatini 2003 6010 43.9 41.2 45.8 4.6
Seychelles 2003 17300 71.9 67.5 76.2 8.7
Tanzania 2003 1580 55.3 51.2 54.6 3.4
South Africa 2003 11600 52.7 51.3 56.4 5.1
Zambia 2003 2170 46.6 44.2 48.5 4.3
Zimbabwe 2003 3310 46.4 41.4 45.1 3.7
Angola 2004 3560 55.5 47 51.7 4.7
Botswana 2004 11300 47.5 49 53.5 4.5
Congo, Dem. Rep. 2004 758 55.5 51.5 54.4 2.9
Comoros 2004 2800 63.8 58.3 61.5 3.2
Lesotho 2004 1890 44.2 41 45.5 4.5
Madagascar 2004 1530 60.6 59.5 62 2.5
Mozambique 2004 793 52.1 48 52.2 4.2
Mauritius 2004 13400 73.2 69 75.6 6.6
Malawi 2004 1100 48 44.5 49.3 4.8
Namibia 2004 7940 52.4 48.9 52.2 3.3
Eswatini 2004 6210 43.1 40.7 44.9 4.2
Seychelles 2004 16900 72 67.8 76.3 8.5
Tanzania 2004 1650 55.9 52 55.2 3.2
South Africa 2004 12000 52.2 51.1 56 4.9
Zambia 2004 2260 47.7 45.3 49.5 4.2
Zimbabwe 2004 3110 46.4 41.4 44.8 3.4
Angola 2005 4310 56.4 47.9 52.6 4.7
Botswana 2005 11600 49.9 50 54.4 4.4
Congo, Dem. Rep. 2005 779 56.1 52.2 55.1 2.9
Comoros 2005 2820 64.1 58.5 61.8 3.3
Lesotho 2005 1970 43.9 40.6 45.1 4.5
Madagascar 2005 1560 61.1 59.9 62.5 2.6
Mozambique 2005 821 52.1 48.2 52.5 4.3
Mauritius 2005 13600 73.2 69.2 75.8 6.6
Malawi 2005 1110 48.9 45.5 50.4 4.9
Namibia 2005 8020 53.5 49.3 52.4 3.1
Eswatini 2005 6550 43.3 40.7 44.6 3.9
Seychelles 2005 18300 72.2 68.2 76.3 8.1
Tanzania 2005 1720 56.5 52.8 55.9 3.1
South Africa 2005 12500 52.1 51.1 55.9 4.8
Zambia 2005 2360 48.7 46.4 50.7 4.3
Zimbabwe 2005 2920 46.8 41.6 45 3.4
Angola 2006 5610 57 48.8 53.6 4.8
Botswana 2006 12300 51.4 51.3 55.7 4.4
Congo, Dem. Rep. 2006 794 56.7 53 55.8 2.8
Comoros 2006 2820 64.4 58.8 62 3.2
Lesotho 2006 2060 44.2 40.4 45.1 4.7
Madagascar 2006 1600 61.5 60.3 63 2.7
Mozambique 2006 876 52.3 48.4 52.8 4.4
Mauritius 2006 14200 73.3 69.3 76 6.7
Malawi 2006 1130 50.1 46.7 51.7 5
Namibia 2006 8440 55.3 50 53 3
Eswatini 2006 6910 44.1 40.9 44.8 3.9
Seychelles 2006 19600 72.4 68.4 76.4 8
Tanzania 2006 1780 57.3 53.6 56.6 3
South Africa 2006 13000 52.3 51.4 56.3 4.9
Zambia 2006 2480 50 47.6 51.9 4.3
Zimbabwe 2006 2800 47.4 42.2 45.5 3.3
Angola 2007 6960 58 49.8 54.7 4.9
Botswana 2007 13100 53.4 52.8 57.3 4.5
Congo, Dem. Rep. 2007 817 57.1 53.7 56.5 2.8
Comoros 2007 2780 64.5 59.2 62.4 3.2
Lesotho 2007 2150 45.1 40.6 45.5 4.9
Madagascar 2007 1640 61.9 60.7 63.5 2.8
Mozambique 2007 918 52.8 48.7 53.1 4.4
Mauritius 2007 14900 73.8 69.4 76.2 6.8
Malawi 2007 1200 51.6 48.1 53.2 5.1
Namibia 2007 8740 57.4 51 54.1 3.1
Eswatini 2007 7170 44.9 41.5 45.4 3.9
Seychelles 2007 21600 72.5 68.7 76.5 7.8
Tanzania 2007 1850 58.5 54.4 57.4 3
South Africa 2007 13600 53.2 52 57 5
Zambia 2007 2620 51.9 49 53.3 4.3
Zimbabwe 2007 2670 48.2 43.3 46.6 3.3
Angola 2008 7850 58.8 50.8 55.8 5
Botswana 2008 13600 54.7 54.4 59 4.6
Congo, Dem. Rep. 2008 840 57.5 54.3 57.2 2.9
Comoros 2008 2820 65.3 59.5 62.7 3.2
Lesotho 2008 2270 46.3 41 46.1 5.1
Madagascar 2008 1700 62.1 61.1 63.9 2.8
Mozambique 2008 958 53.4 49 53.5 4.5
Mauritius 2008 15700 73.7 69.6 76.5 6.9
Malawi 2008 1260 53.4 49.6 54.9 5.3
Namibia 2008 8810 59.2 52.1 55.4 3.3
Eswatini 2008 7180 45.5 42.3 46.4 4.1
Seychelles 2008 20700 72.8 68.9 76.6 7.7
Tanzania 2008 1900 59.6 55.2 58.3 3.1
South Africa 2008 13800 54.3 52.9 58 5.1
Zambia 2008 2750 54.1 50.5 54.8 4.3
Zimbabwe 2008 2180 48.9 44.8 48.1 3.3
Angola 2009 7760 59.5 51.9 57 5.1
Botswana 2009 12300 56.1 56.1 60.8 4.7
Congo, Dem. Rep. 2009 836 58 54.9 57.8 2.9
Comoros 2009 2840 64.8 59.9 63.1 3.2
Lesotho 2009 2240 47.4 41.6 47 5.4
Madagascar 2009 1590 62.3 61.5 64.4 2.9
Mozambique 2009 991 53.5 49.3 54 4.7
Mauritius 2009 16100 73.8 69.8 76.7 6.9
Malawi 2009 1320 55 51.2 56.7 5.5
Namibia 2009 8670 60.1 53.3 56.9 3.6
Eswatini 2009 7240 46.4 43.2 47.6 4.4
Seychelles 2009 20300 72.9 69.1 76.7 7.6
Tanzania 2009 1940 60.3 56 59.3 3.3
South Africa 2009 13400 55.7 53.8 59.2 5.4
Zambia 2009 2920 55.7 52 56.3 4.3
Zimbabwe 2009 2410 50.2 46.7 50.1 3.4
Angola 2010 7690 60.2 52.8 58 5.2
Botswana 2010 13100 57 57.8 62.6 4.8
Congo, Dem. Rep. 2010 866 58.8 55.5 58.3 2.8
Comoros 2010 2880 65.9 60.3 63.5 3.2
Lesotho 2010 2350 48.1 42.4 48.1 5.7
Madagascar 2010 1550 62.6 61.9 64.9 3
Mozambique 2010 1030 53.5 49.8 54.7 4.9
Mauritius 2010 16800 74.2 70 77 7
Malawi 2010 1380 56.3 52.8 58.4 5.6
Namibia 2010 9030 60.9 54.6 58.6 4
Eswatini 2010 7460 48 44.4 49.2 4.8
Seychelles 2010 21000 73.2 69.3 76.8 7.5
Tanzania 2010 2010 61.1 56.8 60.4 3.6
South Africa 2010 13600 57.1 54.9 60.6 5.7
Zambia 2010 3130 56.5 53.5 57.8 4.3
Zimbabwe 2010 2850 52.3 48.9 52.2 3.3
Angola 2011 7680 60.8 53.8 59.1 5.3
Botswana 2011 13700 57.8 59.4 64.3 4.9
Congo, Dem. Rep. 2011 895 59.4 56 58.9 2.9
Comoros 2011 2930 65.7 60.6 63.9 3.3
Lesotho 2011 2450 48.2 43.4 49.3 5.9
Madagascar 2011 1540 62.9 62.4 65.3 2.9
Mozambique 2011 1070 53.6 50.5 55.5 5
Mauritius 2011 17500 74.6 70.3 77.3 7
Malawi 2011 1400 57.7 54.3 60 5.7
Namibia 2011 9330 61.9 55.8 60.2 4.4
Eswatini 2011 7580 49.8 45.7 51 5.3
Seychelles 2011 23200 73.4 69.4 76.9 7.5
Tanzania 2011 2100 61.7 57.6 61.4 3.8
South Africa 2011 13800 58.7 56 61.9 5.9
Zambia 2011 3200 57.5 54.9 59.3 4.4
Zimbabwe 2011 3200 54.4 51.1 54.5 3.4
Angola 2012 8040 61.4 54.7 60 5.3
Botswana 2012 14200 58.6 60.9 66 5.1
Congo, Dem. Rep. 2012 927 60.1 56.5 59.3 2.8
Comoros 2012 2950 66.3 61 64.3 3.3
Lesotho 2012 2590 47.9 44.5 50.5 6
Madagascar 2012 1540 63.2 62.8 65.8 3
Mozambique 2012 1120 53.6 51.3 56.5 5.2
Mauritius 2012 18000 74.7 70.5 77.5 7
Malawi 2012 1390 59.3 55.7 61.6 5.9
Namibia 2012 9630 62.9 56.9 61.7 4.8
Eswatini 2012 7930 50.9 47.1 53 5.9
Seychelles 2012 23300 73.4 69.5 77 7.5
Tanzania 2012 2130 62.7 58.5 62.5 4
South Africa 2012 13900 60.1 57 63.2 6.2
Zambia 2012 3340 58.5 56.2 60.7 4.5
Zimbabwe 2012 3680 56 53.2 56.6 3.4
Angola 2013 8140 62.1 55.4 60.8 5.4
Botswana 2013 15600 59.4 62.3 67.5 5.2
Congo, Dem. Rep. 2013 972 60.9 56.9 59.8 2.9
Comoros 2013 3010 66.7 61.3 64.6 3.3
Lesotho 2013 2610 47.9 45.7 51.8 6.1
Madagascar 2013 1530 63.5 63.2 66.2 3
Mozambique 2013 1170 54 52.2 57.6 5.4
Mauritius 2013 18600 75.1 70.7 77.8 7.1
Malawi 2013 1420 60.6 57 62.9 5.9
Namibia 2013 9990 63.3 57.8 63 5.2
Eswatini 2013 8180 52.1 48.7 55.2 6.5
Seychelles 2013 24200 73.5 69.6 77 7.4
Tanzania 2013 2210 63.7 59.3 63.5 4.2
South Africa 2013 14100 61.3 58 64.4 6.4
Zambia 2013 3400 59.5 57.4 62 4.6
Zimbabwe 2013 3680 57.2 55.1 58.4 3.3
Angola 2014 8240 63 56.1 61.6 5.5
Botswana 2014 16000 60.1 63.5 68.9 5.4
Congo, Dem. Rep. 2014 1030 61.8 57.4 60.3 2.9
Comoros 2014 3000 67.2 61.6 64.9 3.3
Lesotho 2014 2640 47.9 46.9 53.1 6.2
Madagascar 2014 1540 63.8 63.6 66.7 3.1
Mozambique 2014 1220 54.9 53.2 58.7 5.5
Mauritius 2014 19200 75 70.9 77.9 7
Malawi 2014 1460 61.5 58 64.1 6.1
Namibia 2014 10400 63.3 58.6 64.1 5.5
Eswatini 2014 8190 53.5 50.3 57.3 7
Seychelles 2014 24900 73.5 69.6 77.1 7.5
Tanzania 2014 2290 64.5 60.2 64.3 4.1
South Africa 2014 14000 61.8 58.7 65.4 6.7
Zambia 2014 3450 60.2 58.3 63.2 4.9
Zimbabwe 2014 3700 58 56.6 59.9 3.3
Angola 2015 8040 63.5 56.7 62.2 5.5
Botswana 2015 14900 60.6 64.5 70 5.5
Congo, Dem. Rep. 2015 1070 62.6 57.8 60.7 2.9
Comoros 2015 2960 67.5 61.8 65.2 3.4
Lesotho 2015 2700 48.5 48 54.3 6.3
Madagascar 2015 1550 64.2 64 67.1 3.1
Mozambique 2015 1260 55.8 54.3 59.9 5.6
Mauritius 2015 19900 75.2 71.1 78.1 7
Malawi 2015 1460 62 58.9 65 6.1
Namibia 2015 10700 63.5 59.2 64.9 5.7
Eswatini 2015 8310 54.9 51.8 59.4 7.6
Seychelles 2015 25600 73.4 69.7 77.1 7.4
Tanzania 2015 2350 65.1 61.1 65.1 4
South Africa 2015 14000 62.3 59.3 66.1 6.8
Zambia 2015 3440 60.8 59.1 64.3 5.2
Zimbabwe 2015 3710 58.6 57.8 61 3.2
Angola 2016 7570 63.9 57.2 62.8 5.6
Botswana 2016 15700 61.2 65.2 70.9 5.7
Congo, Dem. Rep. 2016 1060 63.3 58.2 61.1 2.9
Comoros 2016 2990 67.9 62 65.4 3.4
Lesotho 2016 2780 49.6 49 55.3 6.3
Madagascar 2016 1570 64.5 64.4 67.5 3.1
Mozambique 2016 1270 56.6 55.3 61.1 5.8
Mauritius 2016 20600 75.2 71.2 78.2 7
Malawi 2016 1450 62.6 59.6 65.8 6.2
Namibia 2016 10500 63.7 59.6 65.4 5.8
Eswatini 2016 8320 56.1 53.2 61.2 8
Seychelles 2016 26400 73.3 69.7 77.2 7.5
Tanzania 2016 2440 65.5 61.9 65.8 3.9
South Africa 2016 13900 62.7 59.8 66.7 6.9
Zambia 2016 3470 61.4 59.7 65.2 5.5
Zimbabwe 2016 3680 59.2 58.6 61.7 3.1
Angola 2017 7310 64.2 57.7 63.3 5.6
Botswana 2017 15900 61.5 65.8 71.6 5.8
Congo, Dem. Rep. 2017 1060 63.9 58.5 61.5 3
Comoros 2017 3030 68.2 62.2 65.7 3.5
Lesotho 2017 2670 50.8 49.8 56.2 6.4
Madagascar 2017 1580 64.8 64.7 67.9 3.2
Mozambique 2017 1280 57.3 56.3 62.1 5.8
Mauritius 2017 21400 75.3 71.4 78.3 6.9
Malawi 2017 1470 63.4 60.2 66.4 6.2
Namibia 2017 10200 64.1 60 65.8 5.8
Eswatini 2017 8410 57 54.3 62.8 8.5
Seychelles 2017 27300 73.4 69.8 77.3 7.5
Tanzania 2017 2530 66 62.6 66.3 3.7
South Africa 2017 13900 63.2 60.2 67.1 6.9
Zambia 2017 3490 61.9 60.2 65.9 5.7
Zimbabwe 2017 3800 59.9 59.1 62.2 3.1
Angola 2018 6930 64.6 58.1 63.7 5.6
Botswana 2018 16200 61.8 66.2 72 5.8
Congo, Dem. Rep. 2018 1090 64.7 58.9 61.9 3
Comoros 2018 3070 68.5 62.4 65.9 3.5
Lesotho 2018 2620 51.4 50.6 57 6.4
Madagascar 2018 1590 65.1 65.1 68.3 3.2
Mozambique 2018 1290 57.9 57.1 63 5.9
Mauritius 2018 22200 75.3 71.5 78.3 6.8
Malawi 2018 1500 64.1 60.7 66.9 6.2
Namibia 2018 10100 64.7 60.4 66.2 5.8
Eswatini 2018 8520 57.8 55.3 64 8.7
Seychelles 2018 27500 73.5 69.8 77.3 7.5
Tanzania 2018 2590 66.7 63.2 66.8 3.6
South Africa 2018 13900 64.4 60.5 67.4 6.9
Zambia 2018 3520 62.5 60.5 66.4 5.9
Zimbabwe 2018 3920 60.6 59.5 62.6 3.1
Angola 2019 6670 65.1 58.4 64 5.6
Botswana 2019 16300 62.3 66.5 72.4 5.9
Congo, Dem. Rep. 2019 1100 65 59.1 62.2 3.1
Comoros 2019 3060 68.7 62.6 66.1 3.5
Lesotho 2019 2580 51.8 51.2 57.6 6.4
Madagascar 2019 1620 65.5 65.4 68.7 3.3
Mozambique 2019 1280 58.4 57.8 63.7 5.9
Mauritius 2019 22900 75.5 71.7 78.5 6.8
Malawi 2019 1540 64.7 61.1 67.4 6.3
Namibia 2019 9810 65.2 60.7 66.5 5.8
Eswatini 2019 8650 58.3 56 64.8 8.8
Seychelles 2019 27600 73.6 69.9 77.4 7.5
Tanzania 2019 2660 67.2 63.6 67.2 3.6
South Africa 2019 13700 65.1 60.7 67.7 7
Zambia 2019 3470 63.2 60.8 66.9 6.1
Zimbabwe 2019 3630 61 59.8 62.9 3.1
Angola 2020 6120 65.2 58.7 64.4 5.7
Botswana 2020 14600 61.6 66.7 72.6 5.9
Congo, Dem. Rep. 2020 1080 65.2 59.4 62.5 3.1
Comoros 2020 2970 68.8 62.8 66.3 3.5
Lesotho 2020 2410 52 51.7 58.1 6.4
Madagascar 2020 1450 65.6 65.7 69.1 3.4
Mozambique 2020 1230 58.4 58.3 64.2 5.9
Mauritius 2020 19500 75.5 71.8 78.6 6.8
Malawi 2020 1510 64.8 61.5 67.8 6.3
Namibia 2020 8820 65.4 61 66.9 5.9
Eswatini 2020 8400 58 56.5 65.4 8.9
Seychelles 2020 25300 73.6 70 77.5 7.5
Tanzania 2020 2710 67.3 64 67.6 3.6
South Africa 2020 12600 64.3 61 67.9 6.9
Zambia 2020 3270 63.1 61.1 67.2 6.1
Zimbabwe 2020 3370 60.7 60 63.2 3.2

The above visualisation confirms that, for every country and every year in the review period, female life expectancy is higher than male life expectancy.
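The same claim can also be checked numerically instead of being read off a plot. The following is a minimal sketch, not the notebook's own code: the frame name `sample_df` and the column names `life_expectancy_male` and `life_expectancy_female` are assumptions, and the four rows are copied from the 2020 rows of the table above on the assumption that its fifth and sixth columns hold male and female life expectancy.

```python
import pandas as pd

# Four 2020 rows copied from the table above (column names are hypothetical).
sample_df = pd.DataFrame({
    "country": ["Angola", "Botswana", "Lesotho", "Zimbabwe"],
    "year": [2020, 2020, 2020, 2020],
    "life_expectancy_male": [58.7, 66.7, 51.7, 60.0],
    "life_expectancy_female": [64.4, 72.6, 58.1, 63.2],
})

# Gap between female and male life expectancy for each country-year row.
gap = sample_df["life_expectancy_female"] - sample_df["life_expectancy_male"]

# True only if the gap is positive in every row.
female_always_higher = bool((gap > 0).all())

# Any rows that contradict the claim would show up here.
exceptions = sample_df[gap <= 0]

print(female_always_higher)
print(len(exceptions))
```

Run against the full dataset, an empty `exceptions` frame would support the statement for every country and year; any rows it contains would be the counterexamples to inspect.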

Question 4: How do the figures for the GDP per capita income and the life expectancies for each of the countries compare to the average figures for the SADC region over the review period?

For this last question, I take a deep dive into each of the indicators. For each indicator, I plot the values for every SADC country across the years 1971 to 2020 on a single visualisation. I also include a line for the SADC average so that it is immediately clear how each country performs against the regional average in each year.

In [70]:
#Plotting the per capita income for each SADC country and the average across the years 1971 to 2020
fig,ax = plt.subplots(figsize=(24,12));
income_df.set_index('country').T.plot(ax=ax);
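# numeric_only=True keeps the text 'country' column out of the average (recent pandas versions raise an error otherwise)
income_df.mean(axis=0, numeric_only=True).plot(ax=ax, color='black', label='SADC average', marker = 'o', linestyle="dotted");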
ax.legend(loc='upper left',fontsize='x-large');
plt.title('SADC per capita income for each country over the years 1971 - 2020');
ax.set_xlabel('Time period (years): from 1971 to 2020');
In [71]:
#Plotting the per capita income for each SADC country and the average across the years 1971 to 2020
#This version shows only the countries that are tracking below the average line
fig,ax = plt.subplots(figsize=(24,12));
income_df.set_index('country').drop(['Seychelles', 'Mauritius', 'Botswana', 'South Africa', 'Namibia', 'Eswatini']).T.plot(ax=ax);
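# The average is still taken over all SADC countries, not only the ones plotted here
income_df.mean(axis=0, numeric_only=True).plot(ax=ax, color='black', label='SADC average', marker = 'o', linestyle="dotted");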
ax.legend(loc='upper left',fontsize='x-large');
plt.title('SADC per capita income for each low-income country (below the SADC average only) over the years 1971 - 2020');
ax.set_xlabel('Time period (years): from 1971 to 2020');
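The list of countries dropped in the cell above is hard-coded. As a hedged alternative, the below-average group could be derived from the data. The sketch below assumes the same wide layout as `income_df` (a `country` column plus one column per year) and uses a hypothetical frame `income_sample` holding four countries and two years copied from the table above. The rule "below the average in every year" is my assumption for illustration, not the criterion used in the original cell.

```python
import pandas as pd

# Small wide-format sample: 2019 and 2020 per capita income from the table above.
income_sample = pd.DataFrame({
    "country": ["Angola", "Botswana", "Malawi", "Seychelles"],
    "2019": [6670, 16300, 1540, 27600],
    "2020": [6120, 14600, 1510, 25300],
})

by_country = income_sample.set_index("country")

# Average across the countries in the sample, one value per year.
regional_average = by_country.mean(axis=0)

# Assumed rule: a country is "below average" if it is below the line in every year.
below_mask = by_country.lt(regional_average, axis=1).all(axis=1)
below_average = below_mask[below_mask].index.tolist()

print(below_average)
```

The resulting list could then be passed to `.loc[...]` to select the countries to plot, which keeps the chart consistent if the underlying data is updated.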
In [72]:
#Plotting the combined life expectancy for each SADC country and the average across the years 1971 to 2020
fig,ax = plt.subplots(figsize=(24,12));
life_expectancy_all_df.set_index('country').T.plot(ax=ax);
# numeric_only=True keeps the text 'country' column out of the average
life_expectancy_all_df.mean(axis=0, numeric_only=True).plot(ax=ax, color='black', label='SADC average', marker = 'o', linestyle="dotted");
ax.legend(loc='upper left',fontsize='x-large');
plt.title('SADC combined life expectancy for each country over the years 1971 - 2020');
ax.set_xlabel('Time period (years): from 1971 to 2020');
In [73]:
#Plotting the male life expectancy for each SADC country and the average across the years 1971 to 2020
fig,ax = plt.subplots(figsize=(24,12));
life_expectancy_male_df.set_index('country').T.plot(ax=ax);
# numeric_only=True keeps the text 'country' column out of the average
life_expectancy_male_df.mean(axis=0, numeric_only=True).plot(ax=ax, color='black', label='SADC average', marker = 'o', linestyle="dotted");
ax.legend(loc='upper left',fontsize='x-large');
plt.title('SADC male life expectancy for each country over the years 1971 - 2020');
ax.set_xlabel('Time period (years): from 1971 to 2020');
In [74]:
#Plotting the female life expectancy for each SADC country and the average across the years 1971 to 2020
fig,ax = plt.subplots(figsize=(24,12));
life_expectancy_female_df.set_index('country').T.plot(ax=ax);
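# numeric_only=True keeps the text 'country' column out of the average
life_expectancy_female_df.mean(axis=0, numeric_only=True).plot(ax=ax, color='black', label='SADC average', marker = 'o', linestyle="dotted");
ax.legend(loc='upper left',fontsize='x-large');
plt.title('SADC female life expectancy for each country over the years 1971 - 2020');
ax.set_xlabel('Time period (years): from 1971 to 2020');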
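The five plotting cells above repeat the same steps with a different frame and title each time. One way to avoid the repetition is a small helper function. The following is only a sketch under stated assumptions: the function name `plot_indicator_with_average` and the sample frame `income_sample` are hypothetical, the input is assumed to be wide format with a `country` column, and the sample values are copied from the 2018 to 2020 rows of the table above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd


def plot_indicator_with_average(wide_df, title, figsize=(24, 12)):
    """Plot one line per country plus a dotted line for the average."""
    by_country = wide_df.set_index("country")
    fig, ax = plt.subplots(figsize=figsize)
    # One line per country, with years on the x-axis.
    by_country.T.plot(ax=ax)
    # Average across countries for each year.
    by_country.mean(axis=0).plot(
        ax=ax, color="black", label="SADC average",
        marker="o", linestyle="dotted",
    )
    ax.legend(loc="upper left", fontsize="x-large")
    ax.set_title(title)
    ax.set_xlabel("Time period (years)")
    return ax


# Hypothetical three-country sample in the same layout as income_df.
income_sample = pd.DataFrame({
    "country": ["Angola", "Botswana", "Malawi"],
    "2018": [6930, 16200, 1500],
    "2019": [6670, 16300, 1540],
    "2020": [6120, 14600, 1510],
})

ax = plot_indicator_with_average(income_sample, "Per capita income (sample)")
```

With such a helper, each of the cells above would reduce to a single call with the relevant frame and title.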

Conclusions

The data seems to indicate that the SADC countries have generally been on a rising trend in both per capita income and life expectancy. The data also seems to show life expectancy and income rising in tandem, although no statistical test was performed and this observation does not establish that one causes the other.
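A simple way to put a number on the "rise in tandem" observation is a per-country Pearson correlation between income and life expectancy. The sketch below is descriptive only: it is not a statistical test and says nothing about causation. The frame name `sample_df` and its column names are assumptions, and the rows are the 2008 to 2012 values for Malawi and Zimbabwe copied from the table above, taking its third and fourth columns to be income and combined life expectancy.

```python
import pandas as pd

# 2008-2012 rows for two countries, copied from the table above.
sample_df = pd.DataFrame({
    "country": ["Malawi"] * 5 + ["Zimbabwe"] * 5,
    "year": list(range(2008, 2013)) * 2,
    "income": [1260, 1320, 1380, 1400, 1390,
               2180, 2410, 2850, 3200, 3680],
    "life_expectancy": [53.4, 55.0, 56.3, 57.7, 59.3,
                        48.9, 50.2, 52.3, 54.4, 56.0],
})

# Pearson correlation between the two indicators, computed per country.
correlation_by_country = pd.Series({
    country: group["income"].corr(group["life_expectancy"])
    for country, group in sample_df.groupby("country")
})

print(correlation_by_country.round(2))
```

Values close to 1 would be consistent with the two indicators moving together over the chosen years; countries with low or negative values would be the exceptions worth a closer look.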

However, there are a few exceptions to these generalizations. Firstly, Madagascar and the Democratic Republic of the Congo seem to have falling per capita incomes over the period under review. Zimbabwe has also had sustained periods of decline on this indicator, although there is a resurgence after 2008 before growth stalls again around 2012. It would be interesting to dig deeper into these cases to see what could have caused these patterns.

The dips in the SADC average on the per capita income indicator coincide with the global financial crisis of 2008 and with the COVID-19 pandemic in 2020. Again, additional data and analysis of this observation may lead to interesting results.
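Dips like these can be located programmatically by computing the year-over-year change of the average line. The following is a minimal sketch: the frame `income_sample` is hypothetical, it holds only three countries and three years copied from the table above, and so its average is not the true SADC average.

```python
import pandas as pd

# 2018-2020 per capita income for three countries, copied from the table above.
income_sample = pd.DataFrame({
    "country": ["Botswana", "Mauritius", "Seychelles"],
    "2018": [16200, 22200, 27500],
    "2019": [16300, 22900, 27600],
    "2020": [14600, 19500, 25300],
})

# Average across the sampled countries for each year.
average_by_year = income_sample.set_index("country").mean(axis=0)

# Percentage change from one year to the next (the first year has no predecessor).
yearly_change_pct = average_by_year.pct_change() * 100

# Years in which the average fell compared with the previous year.
dip_years = yearly_change_pct[yearly_change_pct < 0].index.tolist()

print(yearly_change_pct.round(1))
print(dip_years)
```

Applied to the full frame, the same steps would list every year in which the regional average fell, which could then be compared against known global events.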

On the life expectancy trend, I was drawn to the sustained decline in life expectancy during the period 1991 to 2000. UNAIDS statistics indicate that this is the period in which HIV & AIDS deaths were most prevalent in Sub-Saharan Africa. I would be interested in pursuing future studies into whether this may explain the trend in life expectancy.

In conclusion, I believe this project shows what can be achieved when various indicators are tracked and compared with each other to look for interesting observations. These observations may then lead to more detailed and scientific studies into how certain phenomena can be explained with data and trends. This information can play a major role in informing governance, policy setting and the focus of funding.

Limitations

One thing that may limit how far the observations and conclusions from this project can be generalized is that the data did not take into account the relative population sizes of the SADC countries. An indicator may carry different significance for a population of a few million people than for one of tens of millions. Having this context on population size may add value to the analysis.
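To illustrate why population size matters, the sketch below contrasts a simple average with a population-weighted average. The income values are the 2020 figures for Seychelles and Tanzania from the table above, but the `population_millions` values are placeholders chosen purely for illustration and are NOT real population figures; the frame and column names are also assumptions.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "country": ["Seychelles", "Tanzania"],
    "income_2020": [25300, 2710],        # from the table above
    "population_millions": [0.1, 60.0],  # placeholder values, NOT real data
})

# Simple average: each country counts equally, regardless of size.
simple_average = sample["income_2020"].mean()

# Weighted average: each country counts in proportion to its population.
weighted_average = np.average(
    sample["income_2020"], weights=sample["population_millions"]
)

print(round(simple_average, 1))
print(round(weighted_average, 1))
```

When a small, high-income country is averaged with a large, low-income one, the simple average sits far above the weighted one, which is why the unweighted SADC average line should be read with care.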

Another limitation is that the data gives averages across the entire population, which may suggest that income, for example, is evenly distributed. In a country where income is unevenly distributed and only a few people earn disproportionately high incomes, the average can overstate what a typical resident earns.

Submitting your Project

In [75]:
from subprocess import call
call(['python', '-m', 'nbconvert', 'Investigate_a_Dataset.ipynb'])
Out[75]:
255